Proteomics Impact on Cell Biology to Resolve Cell Structure and Function

The acceleration of advances in proteomics has enabled its integration with imaging at the EM and light microscopy levels, with cryo-EM of protein structures, and with artificial intelligence, so that proteins are comprehensively and accurately resolved for cell structures at nanometer to subnanometer resolution. Proteomics continues to outpace experimentally based structural imaging, but their ultimate integration is a path toward the goal of a compendium of all proteins to understand mechanistically cell structure and function.

The near-complete characterization of proteins in cell structures has been propelled by quantitative proteomics linked to quantitative imaging of the proteins at the EM level, supplemented by artificial intelligence (AI). Today, the technical hurdles are being overcome (1) because of pioneering efforts over decades to establish a pipeline as indicated in the graphical abstract.
For Cell Biology, the progress is noteworthy: from the compendium of cell structures visualized by EM by pioneers of the last century (2), through advances in proteomics, to detailed mechanisms of function obtained by comprehensive visualization of proteins on the same cell structures. Selected examples include the nuclear pore complex, motile cilia, the centrosome, mitotic chromosomes, and the ribosome. The studies have approached and, in some cases, attained the goal of a near-comprehensive, accurate, and permanent compendium of the proteins in cell structures, which has also enabled their mechanism of function to be resolved.

THE CAP GOAL
It was Sydney Brenner who proposed the CAP (Comprehensive, Accurate, and Permanent) criteria indicated in the graphical abstract for large-scale data (3). CAP criteria had been demonstrated by Brenner's mapping of the synaptic connections (the connectome) by serial section EM of the 302 neurons in the nematode Caenorhabditis elegans in 1986 (4). The task took 14 years with over 10,000 electron microscopic thin serial sections imaged and mapped manually. This work remains the basis for the yet unsolved connectome of the vertebrate brain (5).
Today, the CAP goal appears to have been reached for the nuclear pore complex of eukaryotes. The early EM description of a nucleus (Fig. 1) has progressed to a near-complete understanding of the protein makeup and an increasingly deep understanding of the mechanism of function of the nuclear pore harboring the entry and exit points of the nucleus (Fig. 2). The image of Figure 1 may be contrasted with the molecular microscope resolution of the nuclear pore complex today in Figure 2.

THE NUCLEAR PORE COMPLEX
The elucidation of the nuclear pore complex has emerged from more than 30 years of progress.
A pipeline strategy consisting of sample preparation, protein characterization, stoichiometry, protein partners, and complete structural elucidation, including the use of AI and complemented with in situ imaging, led to a model for the mechanism of function. Taken together, the work from yeast to humans has gone a considerable way toward meeting the challenges previously raised for integrating proteomics data into cell biology (6, 7) and making those data available to cell biologists (1).
For yeast, the pyramid of discovery rests on sample preparation. In 1993, Rout and Blobel (8) isolated the nuclear pore complex from the model organism, budding yeast. EM confirmed an eightfold symmetry of the isolated yeast nuclear pore structure that has stood the test of time. Most of the 80 proteins they uncovered that coisolated with the nuclear pore complexes were unknown at the time.
Progress in the structural and functional characterization of the nuclear pore complex came from an integrative combination of multiple orthogonal approaches as a conceptual "molecular microscope" (9). Mass spectrometry enabled the characterization of proteins in the isolated nuclear pore complex (10). Of the total of 174 proteins identified, 30 were concluded to be genuine constituents of the nuclear pore complex and confirmed as nucleoporins.
FIG. 2. 3D structure of the yeast nuclear pore complex. This image reveals the nuclear view (A), the tilted nuclear view (B), the pore membrane view (C), and the half-ring view (D). Adapted from Akey et al., 2023 (6).
Mol Cell Proteomics (2024) 23(5) 100758
Stoichiometry was assessed by quantitative immunoblotting of the tagged proteins to estimate three categories of nucleoporins: at 8, 16, or 32 copies per nuclear pore complex. The characterization of protein-protein interactions in the nuclear pore complex and further assessment of stoichiometry by mass spectrometry followed for yeast (11,12) and independently by Beck et al. (13,14) for human nuclear pore complexes.
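As a rough illustration of how the three copy-number classes relate to the eightfold symmetry, the following Python sketch tallies total protein copies and copies per spoke. The split of nucleoporins across the classes is a hypothetical placeholder, not data from the cited studies, and the even distribution of copies over the eight spokes is an assumption made for the example.

```python
# Copy-number classes per nuclear pore complex, as stated in the text.
COPY_CLASSES = (8, 16, 32)
SPOKES = 8  # eightfold symmetry of the complex


def total_copies(nups_per_class):
    """Total protein copies, given {copies_per_complex: number_of_nucleoporins}."""
    for copies in nups_per_class:
        if copies not in COPY_CLASSES:
            raise ValueError(f"unexpected copy-number class: {copies}")
    return sum(copies * count for copies, count in nups_per_class.items())


def copies_per_spoke(copies_per_complex):
    """Copies of one nucleoporin in a single spoke, assuming an even distribution."""
    return copies_per_complex // SPOKES


# Hypothetical split of 30 nucleoporins across the three classes (illustration only).
example_split = {8: 5, 16: 20, 32: 5}

print(total_copies(example_split))
print([copies_per_spoke(c) for c in COPY_CLASSES])
```

Because every class is a multiple of eight, each nucleoporin maps to a whole number of copies per spoke under this assumption.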
Exhaustive optimization of affinity purification protocols enabled complexes in yeast to be characterized that could also be assessed by imaging (15–19). With cryogenic samples, microscopy defined the spatial location of each protein of the nuclear pore complex (12,20). Protein cross-linking with disuccinimidyl suberate, followed by SDS-PAGE separation and tandem mass spectrometry of tryptic digests of the cross-linked proteins, resolved the proximal residues of interacting or neighboring proteins (20). This provided "NMR-like" distance restraints between nucleoporins with 3077 unique cross-linked pairs of residues that aided the calculation of the unambiguous spatial molecular architecture of the complex. Charge detection mass spectrometry was used to determine the total mass of the tag-isolated nuclear pore complex. A mass of 52 MDa was determined for the isolated nuclear pore complex, which increased to 87 MDa when membrane proteins, cargo proteins, and transport factors were incorporated.
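The use of cross-links as distance restraints can be sketched in a few lines of Python: for a candidate model, count how many cross-linked residue pairs fall within a maximum distance. The cutoff value, the protein names, and the coordinates below are all assumptions chosen for illustration; they are not taken from the cited work.

```python
import math

# Assumed maximum Calpha-Calpha distance (angstroms) bridged by one cross-link.
# This is an illustrative choice, not a value stated in the article.
MAX_CA_DISTANCE = 35.0


def satisfied_fraction(coords, crosslinks, cutoff=MAX_CA_DISTANCE):
    """Fraction of cross-linked residue pairs whose model distance is within cutoff."""
    if not crosslinks:
        raise ValueError("no cross-links supplied")
    satisfied = sum(
        1 for a, b in crosslinks if math.dist(coords[a], coords[b]) <= cutoff
    )
    return satisfied / len(crosslinks)


# Hypothetical residue coordinates in angstroms, keyed by (protein, residue number).
coords = {
    ("NupA", 10): (0.0, 0.0, 0.0),
    ("NupB", 55): (20.0, 0.0, 0.0),
    ("NupC", 7): (0.0, 60.0, 0.0),
}
# Hypothetical cross-linked residue pairs.
crosslinks = [
    (("NupA", 10), ("NupB", 55)),  # 20 angstroms apart: satisfied
    (("NupA", 10), ("NupC", 7)),   # 60 angstroms apart: violated
]

print(satisfied_fraction(coords, crosslinks))  # 0.5
```

In an integrative modeling setting, a score of this kind would be one term among several; here it only shows how a list of cross-linked pairs constrains where residues can sit relative to each other.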
The detailed cross-linking studies in yeast complemented and extended prior independent studies using proximity-dependent biotin identification (BioID) in human cells that did not require prior isolation of nuclear pore complexes (21). More complexes were characterized by BioID than observed at that time for cross-linking mass spectrometry of isolated human nuclear pore complexes, with 17 confidently assigned for the latter (22). However, the labeling radius for proteins able to be characterized by mass spectrometry after BioID pull down was measured as ca. 10 nm rather than the higher resolution of ca. 2 nm by cross-linking.
Yeast studies were further extended by using cryogenic samples to isolate nuclear pore complexes and visualization by electron cryotomography. The integration of images from different tilt angles of the cryogenic nuclear pore complexes in the EM generated a 3D tomogram (23). Using the accurate localization of all cross-linked residues of the yeast nuclear pore complex as determined by mass spectrometry, all 552 proteins in the nuclear pore complex were resolved (24).
Protein structures solved by cryo-EM were supplemented by AI (25). AlphaFold2 assigned missing structures, with the images coherent with the stoichiometry from quantitative mass spectrometry, to define the complete structure accurately and comprehensively. This included rare yeast nuclear pore complexes with double outer rings (25). Double outer rings are characteristic of vertebrate nuclear pore complexes (22).
The detailed structure (Fig. 2) in yeast (25) confirms the architecture of the earlier observed eightfold symmetrical structures of the nuclear pore complex (8). This now includes how the eight spokes link the nuclear pore complex to the nuclear envelope. The spokes and associated inner and outer rings are linked to the central transporter filled with nucleoporin FG (phenylalanine, glycine) repeat proteins that are interspersed with abundant transport factors. The entire nuclear pore complex reveals an overall structure reminiscent of how a suspension bridge enables high-volume transport while resisting stress (24).

A MECHANISM OF FUNCTION OF THE NUCLEAR PORE COMPLEX
For Cell Biology, one goal is to solve the molecular mechanism of function. It is through the molecular microscope (graphical abstract) that this may have been achieved for the nuclear pore complex. The yeast nuclear pore complex is about 100 nm in diameter with a center of about 50 nm diameter representing the central transporter (Fig. 3) (26). The central transporter prevents the passage of nonspecific macromolecules, whereas metabolites and ions are small enough to diffuse passively through the nuclear pore.
An early model deduced from the proteomics-based molecular microscope proposed a "virtual gate" to explain how the protein constituents of the central transporter regulate access to or exit from the nuclear interior (10). The virtual gate mechanism is now understood as an entropic/enthalpic barrier of the central transporter through the action of the FG-repeat nucleoporins. Although these are intrinsically disordered proteins, they are now understood to adopt an extended conformation (27).
FIG. 3. Schematic of the nuclear pore complex of budding yeast showing the central transporter filled with the intrinsically disordered FG nucleoporins that select nuclear transport receptors with associated cargo for import and export across the nuclear pore. Reproduced from Cowburn and Rout, 2023 (23). FG, phenylalanine, glycine.
The central transporter is packed with nuclear transport receptors and their protein or RNA cargo as deduced by quantitative mass spectrometry using BioID by the Beck group (28). Together with the FG-repeat nucleoporins, they form a barrier to prevent cargo not associated with cargo carriers from entering or exiting through the nuclear pore complex. Transport is enabled by the offset of entropy and enthalpy. The entropy component is derived from the restriction of cargo carriers within the small channel, further constrained by the dynamic motion of intrinsically disordered FG-repeat nucleoporins, randomly batting out unsolicited cargo. The enthalpy component is derived from nuclear transport receptors, which use the enthalpy of binding to FG domains to carry their cargo across the channel by hopping between FG repeats within and between nucleoporins. The FG proteins interact with several sites on transport receptors, and the receptors hop along the FG-repeat proteins with an on/off time scale of microseconds (26).
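One hedged way to write the offset of entropy and enthalpy is with the standard Gibbs free energy relation. The subscripted labels are introduced here for illustration only and are not symbols used in the cited studies.

```latex
\Delta G_{\mathrm{entry}} = \Delta H_{\mathrm{FG\ binding}} - T\,\Delta S_{\mathrm{confinement}},
\qquad \Delta H_{\mathrm{FG\ binding}} < 0, \quad \Delta S_{\mathrm{confinement}} < 0
```

Under this reading, confinement in the crowded channel contributes an unfavorable term, $-T\,\Delta S_{\mathrm{confinement}} > 0$. A transport receptor offsets it through favorable binding to FG repeats, whereas cargo without a carrier has little binding enthalpy to spend and is excluded.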
The Beck group has studied extensively the vertebrate nuclear pore complex. It has a similar inner ring but differs from that in yeast by having two outer rings while retaining similar architectural principles to the nuclear pore complex of yeast (22). The proteomics of isolated nuclear pore complexes from mammalian sources provided a parts list that was quantitative (29). This could now be extended to the use of integrated datasets from proteomics and EM cryotomography to elucidate the human nuclear pore complex (22). This study also utilized cross-linking mass spectrometry to characterize protein complexes followed by their visualization through cryoelectron tomography of isolated nuclear envelopes. It is the matching of protein structures with complexes visualized based on the cross-linking mass spectrometry studies (30) that resulted in the resolved nuclear pore complex (22). Additional phosphoproteomics studies (22) gave insight into how the nuclear pore complex is assembled.
Further studies using the integration of data from cryoelectron tomography and mass spectrometry provided a more comprehensive elucidation of the human nuclear pore complex (31). The study gave insight as to how different concentric rings of the nuclear pore complex can be assembled from the same building blocks and documented their molecular interactions. It also showed the relevance of protein phosphorylation to the onset of mitosis.
The integrated data were extended further for the human nuclear pore complex, now with the additional application of AI-based structure determinations from AlphaFold2 and RoseTTAFold (32). In this way, the structure was resolved for most proteins of the cytoplasmic ring as well as the nuclear and inner rings to near-atomic resolution. Insight was provided into the mechanism of nuclear pore transport by the FG domain proteins, the anchoring mechanism of the nucleoporin scaffold proteins to the membrane of the nuclear envelope, and the membrane conformation around the pore. Remarkably, conformational differences in the nuclear pore complex were resolved for nuclear pore constriction and widening (33).
The key role of proteomics has led to the elucidation of the structure and function of the nuclear pore complex through the molecular microscope (9), which has satisfied the Brenner criteria of the CAP principle for the yeast nuclear pore complex and is near completion for human nuclear pore complexes. The independent studies of yeast (Fig. 2) and human nuclear pore complexes (32) revealed the major differences between mammalian and yeast nuclear pore complexes while sharing a common mechanism of transport into and out of the nucleus. It is the application of an integrated strategy of data from proteomics and molecular imaging, now with the advance of AI, that has propelled the field.

MOTILE CILIA
Progress in attaining the CAP principle has been made by studies on motile cilia. Motile cilia (Fig. 4) are essential structures in oviducts, respiratory epithelia, brain ventricles, and sperm flagella. As for the nuclear pore, model organisms have been key and have been complemented by studies on human motile cilia, including human sperm flagella.
The motility of cilia and flagella is due to the microtubule-based structure known as the axoneme. Progress in the resolution of the protein makeup and visualization at nanometer resolution in motile cilia has also benefitted from model organisms. These include Chlamydomonas reinhardtii, which is biflagellate (34) and conserves the arrangement of two central singlet microtubules surrounded by nine associated doublet microtubules that is also seen in sperm flagella, as visualized by EM by the pioneers in cell biology (Fig. 5). From the proteomics of the central apparatus of the singlet microtubules and associated proteins, cryo-EM of the Chlamydomonas central apparatus followed (35). The high-resolution study revealed conservation among species and insight into the molecular regulatory mechanism for ciliary beating. For the doublet microtubules, a comprehensive proteomics characterization of the ciliate Tetrahymena thermophila using cross-linking of cilia in situ identified the microtubule inner proteins of the inner wall of the doublet microtubules (36). A further study with cross-linking-based proteomics, BioID, pull-down assays, cryo-EM, cryo-electron tomography, and AI-based AlphaFold2 solved a model for the nexin-dynein regulatory complex (37).
The achievement of CAP criteria has enabled insight into the motile flagella of C. reinhardtii and the cilia of human respiratory epithelium by Walton et al. (38). The molecular mechanisms for regulating ciliary motility have been deduced with a molecular explanation for human patients suffering from primary ciliary dyskinesia. The isolation of axonemes followed by cryo-EM and integration with cryoelectron tomography data and AI has visualized the 9 + 2 structure of the axoneme of C. reinhardtii at near-atomic resolution (Fig. 6). This was extended to human cilia including those from patients with mutations leading to ciliary dyskinesia. A regulation of ciliary beating with a proximal to distal waveform was proposed with a mechanical signal via the dynein regulatory complex (38).
From these model organisms that extend to humans and human patients, a near-atomic resolution of the doublet microtubule of the motile cilia also revealed the filamentous microtubule inner proteins to understand how ciliary waveforms are regulated during ciliary beating to propel cell and fluid movement (38–40).

THE CENTROSOME
The centrosome is the major microtubule-organizing center in animal cells (41,42). The structure has a mother-and-daughter centriole that establishes the axoneme for motile flagella as well as nonmotile cilia. During the cell cycle, mother and daughter centrioles are duplicated and segregated to ensure a mother-daughter pair in each new cell after mitosis. A cilium is generated in G1 or G0 via the mother centriole's distal appendage proteins that are absent from the associated daughter centriole. The mother centriole distal appendages are associated with the apical surface of polarized epithelial cells. The seeding and assembly of the axoneme for nonmotile cilia formation are generated and supported by the base mother centriole in association with the daughter centriole. The mother centriole is known as the basal body of the cilium. Fibroblasts use a vesicular transport mechanism of mother centriole association via the distal appendage proteins to target the cell surface with membrane fusion and cilia formation from the basal body.
A recent proteomics compendium of proteins of the centrosome has applied a pull-down variation (43) of BioID (44) that has been used extensively for proteomics. Named CAPture (centrosome affinity capture), the method is directed at centrosome isolation from cell lysates (43). A synthetic peptide of 33 amino acids was selected from the protein variable flagellar protein 3, also named CCDC61, coupled to biotin. After incubation with lysates, magnetic beads coupled to streptavidin were used to isolate centrosomes. Protein characterization by LC-MS-MS of tryptic digests was followed with tandem mass tag labeling quantification. The comprehensive results extended past proteomics studies of centrosomes.
Insight into the mechanism of assembly of the proteins of subdistal appendages onto daughter centrioles to become mother centrioles was also demonstrated. Here, proteomics and CRISPR-Cas9 ablation of genes for specific proteins of the distal appendages of centrioles were used to map the hierarchy of distal appendage protein assembly that converts a daughter centriole into a mother centriole for ciliogenesis. The assembly map provides a molecular basis for understanding how cells distinguish daughter from mother centrioles, with cilia formation from only the latter.
Flagella of mammalian sperm have also been studied with an integrative approach, combining information from cryoelectron tomography, quantitative proteomics, and AI with AlphaFold2 for atomic resolution of proteins. The authors refer to this integrative approach as the virtual microscope (45,46). The proteomics of flagella, based on prior work, remains the foundation of the virtual microscope, for example (47–49). In this way, atomic models of motile cilia have been resolved (50).
The work has also extended the cryo-electron tomography and proteomics studies of the complete proteome of nonciliated thymus centrosomes (51). Here, as in several types of dividing cells but not all, the centrosome organizes the mitotic spindle as visualized by pioneers in cell biology (Fig. 7). The paired centrioles assure accurate chromosome segregation during mitosis as well as the delivery of centrioles to new cells after mitosis (52). The association of centrosome microtubules with chromosomes is via the kinetochore on centromeric DNA of paired chromosomes.

MITOTIC CHROMOSOMES
Mitotic microtubule-associated proteins were characterized by proteomics in 2001. Cell-free polymerization of microtubules in mitotic extracts of HeLa cells, followed by separation through discontinuous sucrose gradient centrifugation, protein separation by SDS-PAGE, and tandem mass spectrometry of tryptic peptides, enabled protein characterization. Of the 15 proteins uncovered, astrin, a previously unreported coiled-coil protein, was found to localize to kinetochores with chromosomes aligned at the metaphase plate (53). In 2005, human metaphase chromosomes (shown here by EM from past cell biology pioneers, Fig. 8) were isolated from synchronized cells, with protein separation by SDS-PAGE or 2D gels and with bands excised, digested, and analyzed by MALDI-TOF/TOF tandem mass spectrometry; 158 proteins were characterized, including eight subunits of the condensin I and II complexes (54).
In 2010, a methodology of protein multiclassifier combinatorial proteomics revealed 4000 proteins in isolated mitotic chromosomes (55). Random Forest-based machine learning then integrated the classifiers with a separate bioinformatics classifier. The authors tested 50 previously uncharacterized proteins found in mitotic chromosomes; 34 were predicted by the multiclassifier to be on mitotic chromosomes, and 30 were confirmed by GFP tagging expression and microscopy. All known centromeric subcomplexes were thus identified, along with an additional 110 not previously known as kinetochore associated. The multiclassifier strategy for proteomics has approached a comprehensive, accurate, and permanent characterization of proteins of mitotic chromosomes.
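The validation figures quoted above can be turned into a simple confirmation rate. The short Python sketch below does this under the assumption that the 30 confirmed proteins are a subset of the 34 predicted ones, which is one reading of the sentence rather than a statement from the cited study.

```python
def rate(part, whole):
    """Return part / whole as a fraction, guarding against an empty denominator."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return part / whole


tested = 50     # previously uncharacterized proteins tested
predicted = 34  # predicted by the multiclassifier to be on mitotic chromosomes
confirmed = 30  # confirmed by GFP tagging and microscopy (assumed subset of predicted)

print(round(rate(predicted, tested), 2))     # 0.68 of tested proteins were predicted
print(round(rate(confirmed, predicted), 3))  # 0.882 of predictions were confirmed
```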
A further extension of the strategy was its application to the structural maintenance of chromosomes (SMC) protein complexes. As before, mitotic chromosomes were isolated from nocodazole-treated DT40 cells. Stable isotope labeling by amino acids in cell culture and tandem mass spectrometry were used to compare proteins from wild-type cells and cells with conditional KO of SMC2, CAP-H, CAP-D3, or SMC5, to test the dependence of proteins associated with mitotic chromosomes on members of the condensin, cohesin, and SMC5/6 complexes. Here, nano Random Forest machine learning based on the previous multiclassifier with combinatorial proteomics was used to integrate the proteomics datasets (56). The nano Random Forest strategy concluded that 113 of the 5038 proteins characterized were required for chromosome structure and segregation, one third of which were previously known. Proteins linked to kinetochore function were tested by siRNA, and GFP-tagged proteins were expressed and visualized on mitotic chromosomes in cells. The study established the proteins functionally associated with mitotic chromosomes regulated by condensin I, condensin II, cohesin, SMC5/6, and Scc2/4 as well as their interdependencies (57). Proteomics has established a comprehensive, accurate, and permanent dataset of proteins needed for chromosome shape in mitosis. These are regulated by condensin I, condensin II complexes, the cohesin complex, chromokinesin KIF4A, and topoisomerase II alpha (58).
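The logic of a knockout-versus-wild-type comparison by stable isotope labeling can be illustrated with a minimal Python sketch that flags proteins depleted from chromosomes in the knockout. The protein names, intensities, and the twofold threshold are hypothetical; the cited study integrated such ratios with machine learning rather than using a fixed cutoff.

```python
import math


def log2_ratio(ko_intensity, wt_intensity):
    """log2 of knockout over wild-type intensity for one protein."""
    if ko_intensity <= 0 or wt_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(ko_intensity / wt_intensity)


def depleted_proteins(intensities, threshold=-1.0):
    """Names of proteins whose log2(KO/WT) is at or below threshold (assumed cutoff)."""
    return sorted(
        name
        for name, (ko, wt) in intensities.items()
        if log2_ratio(ko, wt) <= threshold
    )


# Hypothetical (knockout, wild-type) intensities on isolated mitotic chromosomes.
intensities = {
    "ProteinA": (250.0, 1000.0),  # fourfold lower in the knockout
    "ProteinB": (900.0, 1000.0),  # essentially unchanged
    "ProteinC": (500.0, 1000.0),  # twofold lower in the knockout
}

print(depleted_proteins(intensities))  # ['ProteinA', 'ProteinC']
```

Proteins flagged in this way would be candidates for dependence on the deleted subunit; confirming that requires the follow-up experiments described in the text.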
Protein structures from cryo-EM are now giving insight into kinetochore architecture (59). For the structural maintenance of chromosomes protein complexes, condensin, cohesin, and SMC5, 3D protein structures have revealed a common 40 nm diameter ring with coiled-coil SMC proteins, with insight into the mechanism of DNA loop extrusion (reviewed in Ref. (60)). Remarkably, the use of AlphaFold to compare cohesin subcomplexes from different species enabled an understanding of mutations not previously interpretable (61). Structures of ternary complexes and of the quaternary complex of Wapl, Pds5, SA/Scc3, and Scc1 were also predicted. The AI-based structures have given impetus to study previously unknown interactions between cohesin subunits, how cohesin extrudes DNA loops, how cohesin's exit and entry gates are regulated, and cohesin dissociation from chromosomes, with several predictions for mechanisms that are under test experimentally (61,62).

THE RIBOSOME
The aforementioned examples are selected discoveries that have attained a CAP or near-CAP compendium of proteins to resolve cell structure and mechanistic function. An earlier effort was achieved for the ribosome. First visualized by EM by Palade in 1955 (63), its proteins were later characterized after ribosome isolation, protein separation and purification from 2D gels, and protein cross-linking and immuno-EM for bacterial ribosomes (64). Functional reconstitution followed for bacterial ribosomes and extension to the characterization of proteins in ribosomes from budding yeast as a model eukaryote (65). Further advances continued using proteomics (e.g., (66,67)) and integration with cryo-EM (68). The complete solution of ribosome structure was attained with mechanistic insight into function (e.g., (69–72)). The proteomics of ribosomes continues to generate discoveries (e.g., (73)) as does imaging by time-resolved cryo-EM (74). For the latter, cryo-EM has resolved GTP-dependent structural intermediates with a subnanometer spatial resolution and a time resolution of milliseconds.
It is the eventual application of the molecular/virtual microscope as indicated in the graphical abstract that will resolve the molecular anatomy and functional mechanism of each structure, organelle, compartment of the cell, and all interaction domains between and among cell structures as based on the foundation of proteomics. For Cell Biology, the progress from EM descriptions of cell structures (2) to the proteomics-based molecular microscope has been noteworthy. Such advances with a CAP-assured characterization of proteins to integrate with imaging will resolve all cell structures and provide mechanistic insight into function (95,96).
2024, Mol Cell Proteomics 23(5), 100758
Crown Copyright © 2024. Published by Elsevier Inc on behalf of American Society for Biochemistry and Molecular Biology. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
https://doi.org/10.1016/j.mcpro.2024.100758
Proteomics Impact on Cell Biology to Resolve Cell Structure and Function
John J. M. Bergeron *

FIG. 1. A plasma cell nucleus is encompassed by the nuclear envelope in the cytoplasm of the cell. Reproduced from Fawcett 1981 (Fig. 114) (2).

FIG. 5. EM of the axoneme and associated fibers in the principal piece of a spermatozoan flagellum showing the protofibrils in the walls of the two central singlet microtubules and of the surrounding nine doublet microtubules. The cortex and medulla of the outer fibers are also clearly differentiated. Reproduced from Fawcett 1981 (Fig. 336) (2).

FIG. 6. Cross-section of the axoneme from flagella of Chlamydomonas reinhardtii from Fig. 1B of Walton et al. (38). The two central singlet microtubules (C1, C2) surrounded by the nine associated doublet microtubules (DMTs) are indicated. A comparison with the image of sperm flagella (Fig. 5) reveals the noteworthy advance with detailed protein structures extending the basic features of the EM imaging of the axoneme of the last century.

FIG. 8. High-voltage EM of CHO cell metaphase chromosome prepared in the absence of calcium. Lower, CHO metaphase chromosome isolated in the presence of calcium. Reproduced from Fawcett 1981 (Figs. 128 and 129) (2). CHO, Chinese hamster ovary cell line.