Global Survey of Protein Expression during Gonadal Sex Determination in Mice*

The development of an embryo as male or female depends on differentiation of the gonads as either testes or ovaries. A number of genes are known to be important for gonadal differentiation, but our understanding of the regulatory networks underpinning sex determination remains fragmentary. To advance our understanding of sexual development beyond the transcriptome level, we performed the first global survey of the mouse gonad proteome at the time of sex determination by using two-dimensional nanoflow LC-MS/MS. The resulting data set contains a total of 1037 gene products (154 non-redundant and 883 redundant proteins) identified from 620 peptides. Functional classification and biological network construction suggested that the identified proteins primarily serve in RNA post-transcriptional modification and trafficking, protein synthesis and folding, and post-translational modification. The data set contains potential novel regulators of gonad development and sex determination not revealed previously by transcriptomics and proteomics studies and more than 60 proteins with potential links to human disorders of sexual development.

Molecular & Cellular Proteomics 8:2624-2641, 2009.
The reproductive success of all animal species depends on correct development of the sexual organs, in particular the gonads. Sex determining region of Chr Y (Sry), a gene residing on the Y chromosome, is the linchpin in sexual fate determination of the bipotential gonad: its presence initiates a cascade of molecular events leading to testis formation in an XY individual, whereas its absence or dysfunction typically results in ovary formation (for reviews, see Refs. 1 and 2). Although much effort has been dedicated to deciphering the molecular events responsible for gonadal sex determination, only a handful of genes have been conclusively linked to this process.
In mice, the gonadal primordia, also called the genital ridges, can first be seen as thin layers of cells lying on the surface of the mesonephroi around 10 days postcoitum (dpc), and primordial germ cells (PGCs) begin to populate the genital ridges of both sexes soon after. The gonads remain visibly indistinguishable between the sexes at 11.5 dpc, but just half a day later the male genital ridge begins rapid morphological changes. Sertoli cells begin to differentiate and cluster with germ cells in the male gonad. Sertoli cells then induce differentiation of other testicular cell types including steroidogenic Leydig cells in the interstitium, peritubular myoid cells that encapsulate the testis cords, and endothelial cells that form the male-specific vasculature (for a review, see Ref. 3). Somatic cells also begin to instruct the germ cells to undergo mitotic arrest by 14.5 dpc rather than entering meiosis at 13.5 dpc as they are induced to do in the ovary (4,5). By 12.5 dpc the testis is organized into cords and interstitial compartments, whereas the ovary has no definable structure at this stage.
The timing of these events in testis differentiation correlates with the activity of Sry and its target gene, SRY box containing gene 9 (Sox9). In mice, Sry is active in the Sertoli precursor cells from 10.5 to 12.5 dpc. Expression of SRY up-regulates Sox9 expression beginning around 11.5 dpc; Sry and Sox9 are both necessary and sufficient for testis determination (6-11), implying that one or both genes activate a suite of targets in the testis-determining pathway. However, the elements of this pathway and their regulatory inter-relationships are largely unknown. Furthermore, several genes exhibit ovary-specific expression as early as 11.5 dpc, indicating that, at a molecular level, ovarian differentiation has in fact begun (12-15). Nevertheless, our understanding of the control of ovary differentiation remains limited.
To identify molecules expressed while the gonads are undifferentiated and others that are involved in the initial phase of sex determination and sex differentiation, we analyzed protein expression in mouse gonads during the critical developmental time frame (11.5-12.0 dpc) when sex-specific molecular and morphological events have just been initiated. To date, the most comprehensive molecular investigations of male and female embryonic gonads have been performed at a transcriptional level. These screens, involving mRNA differential display (16,17), cDNA library screening followed by whole-mount in situ hybridization (18), suppression subtractive hybridization (19-22), and microarray analysis (23-26), have broadened the scope of potential candidate genes involved in gonad and PGC development and sexual differentiation. However, there has been no thorough investigation of the embryonic gonads at the protein level during the critical developmental window of sex determination, leaving a dearth of knowledge surrounding post-transcriptional events such as alternative splicing, post-translational modifications such as phosphorylation and glycosylation, and protein folding as well as inter- and intracellular interactions that may be occurring in the developing gonads at this time.
Here we used two-dimensional (2D) nanoflow LC-MS/MS to address these issues. 2D LC-MS/MS is a semiautomated method that has proven to be an effective and robust technique for rapid, large scale proteomics analyses (27-29). We utilized the resolving power of 2D LC-MS/MS to generate the first comprehensive proteome data set of embryonic gonads, providing a molecular description of gonadal differentiation at the protein level.

EXPERIMENTAL PROCEDURES
Materials-Chemicals were purchased from Sigma-Aldrich at the highest research grade with the exception of chloroform and methanol from Fronine (Riverstone, New South Wales, Australia), urea from ICN Biochemicals (Solon, OH), and trypsin from Promega (Annandale, New South Wales, Australia).
Animals-Institutional and state ethical approval was secured for the use of all mice in this research program. Embryos were collected from timed matings of the Swiss Quackenbush outbred strain with noon of the day on which the mating plug was observed designated 0.5 dpc. Sexing of embryos at these early developmental stages was done using the X-linked enhanced green fluorescent protein marker transgenic mouse line (30), which was a kind gift from Andras Nagy.
Embryonic Gonad Collection and Preparation of Protein Extracts-Approximately 600 gonad pairs (with mesonephroi removed) were resected from male and female mouse embryos at the ages of 11.5 and 12.0 dpc for use in the proteomics screen (gonads from each sex were kept separate throughout the proteomics analysis). Following collection in ice-cold PBS, tissues were washed in 250 mM sucrose with 10 mM Tris, and proteins were then extracted using 7 M urea, 2 M thiourea, 4% CHAPS, and 30 mM Tris. After 1-h incubation on ice (vortexing every 10 min), the supernatants were collected by centrifugation at 10,000 × g for 10 min at 4°C and preserved as aliquots at −80°C for subsequent experiments. The protein content of the supernatants was determined using an Ettan™ 2-D Quant kit according to the manufacturer's instructions (GE Healthcare).
Before proteomics analysis, the gonadal extracts were reduced with 2 mM dithiothreitol on ice for 30 min, then alkylated with 5 mM iodoacetamide on ice for 30 min, and precipitated with methanol/chloroform to remove salts and detergents (31). Protein pellets were resuspended in 50 mM ammonium bicarbonate containing 40 µg/ml trypsin to a ratio of 50:1 (w/w) and digested overnight at 37°C. Tryptic peptides were dried using a vacuum concentrator (SpeedVac, Quantum Scientific, Lane Cove West, New South Wales, Australia) at 60°C for approximately 2.5 h and resuspended in 10 µl of ion exchange (IEX) buffer A containing 20 mM citric acid in 25% ACN, pH 2.65. Totals of 250 and 300 µg of extracted protein from embryonic gonads, corresponding to five female and six male replicates of 50 µg, were used in the proteomics analysis.
Peptide Separation by Two-dimensional Liquid Chromatography-The Ettan multidimensional LC system from GE Healthcare, described by Yang et al. (32), was used to purify, desalt, and separate tryptic peptides prior to on-line MS and MS/MS analysis. The system was controlled by Finnigan™ Xcalibur® software (version 3.2, Thermo-Scientific, Waltham, MA) in high throughput, dual nanoflow mode and configured for off-line fractionation for improved identification rate and sequence coverage.
First dimension peptide prefractionation was performed by strong cation exchange (SCX) using a Thermo-Scientific IEX column (Bio-Basic® SCX, 5 µm, 30 × 0.32-mm internal diameter) at a consistent flow rate of 200 µl/min. A linear salt gradient of 0-60% IEX buffer B, which consisted of IEX buffer A plus 1 M NH4Cl, was applied for approximately 30 min. Throughout the separations, 250-µl fractions were collected using a fraction collector, Frac-950 (GE Healthcare). A total of 20 fractions for each embryonic gonadal sample replicate was collected.
For second dimension peptide separation by nanoflow reversed phase (RP) LC, each fraction was vacuum-centrifuged to dryness and then redissolved in 5 µl of RP LC buffer A (0.1% formic acid). Half of this volume was injected onto the nanoflow RP LC system equipped with two Agilent (Santa Clara, CA) trap columns (Zorbax™ 330SB C18, 5 µm, 5 × 0.3-mm internal diameter) and two Agilent RP LC columns (Zorbax 300SB C18, 3.5 µm, 15 × 0.075-mm internal diameter). Peptides were eluted from a trap column onto an analytical RP LC column for high resolution separation, which was performed at a flow rate of 200 nl/min by applying a 50-min linear gradient of 0-60% RP LC buffer B, consisting of buffer A plus 84% ACN. The two trap-RP LC column assemblies alternated between equilibrating with buffer A and running sample to allow a continuous flow of sample through the system.
Mass Spectrometry-Mass spectrometric analysis of eluted embryonic gonadal peptides was performed on a linear trap quadrupole system (Thermo-Scientific) equipped with a nano-ion spray interface for on-line coupling to the Ettan multidimensional LC system. An ESI source needle (30 µm tip) (Proteomass, Victoria, Australia) was used with a needle voltage of 1.6 kV in positive ion mode.
The mass spectrometer was operated in data-dependent mode, automatically switching between MS and MS/MS acquisition. Each full MS scan, collected in profile mode, was followed by MS/MS scans, collected in centroid mode, of the three most intense peaks in the MS spectrum with dynamic exclusion set to 25 s after one occurrence.
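The data-dependent acquisition logic described above (top three peaks per full scan, 25-s dynamic exclusion after one occurrence) can be illustrated with a minimal Python sketch. This is not the instrument's control software: the function name `select_precursors`, the input layout, and the use of m/z values rounded to 0.1 as the exclusion key are all assumptions made for illustration only.

```python
def select_precursors(scans, top_n=3, exclusion_s=25.0):
    """Pick up to top_n most intense peaks per full MS scan for MS/MS.

    scans: list of (time_s, [(mz, intensity), ...]) in acquisition order.
    A peak selected once is excluded for exclusion_s seconds.
    The 0.1 m/z rounding used as exclusion key is a hypothetical choice.
    """
    excluded_until = {}  # rounded m/z -> time at which the exclusion expires
    selections = []
    for time_s, peaks in scans:
        chosen = []
        # Walk through the peaks from most to least intense.
        for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
            key = round(mz, 1)
            if excluded_until.get(key, float("-inf")) > time_s:
                continue  # still on the dynamic exclusion list
            chosen.append(mz)
            excluded_until[key] = time_s + exclusion_s
            if len(chosen) == top_n:
                break
        selections.append((time_s, chosen))
    return selections


if __name__ == "__main__":
    peaks = [(500.2, 100.0), (600.3, 90.0), (700.4, 80.0), (800.5, 70.0)]
    for time_s, chosen in select_precursors([(0.0, peaks), (10.0, peaks), (30.0, peaks)]):
        print(time_s, chosen)
```

In this toy run the fourth most intense peak is only fragmented in the second scan, once the three stronger peaks are temporarily excluded, which is the mechanism by which dynamic exclusion extends coverage beyond the most abundant peptides.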
Analysis of MS/MS Data-For protein identification the derived mass spectrometric data sets were converted to TurboSequest™ (Thermo-Scientific) generic format (*.dat) files using the Bioworks Browser (version 3.2). The files were then searched against the International Protein Index (IPI; Ref. 33) database (version 3.19, released July 12, 2006, European Bioinformatics Institute) containing the forward sequences of all 51,252 proteins in the data set using the TurboSequest search algorithm (version 3.2). A decoy database was prepared by reversing the sequence of each entry and rerunning the searches. The species subset was set as Mus musculus. Oxidized methionine was set to differential modification. The number of allowed missed cleavages was set to 1.0. Peptide tolerance was set to 1.0, and the intensity threshold was set to 100. The parent ion selection was set to 1.4 Da with fragment ion tolerance set to 0.7 Da. Where possible a protein hit was based on at least two identified peptides from the same molecule. For gene product assignments based on two unique spectra we used a minimum Xcorr value of 2.5 for +1 and +2 peptides and 3.0 for +3 peptides. For gene products identified with only a single peptide hit, the more stringent criteria of a minimum Xcorr of 2.9 for +1 and +2 peptides and 3.2 for +3 peptides were applied. Random singletons were manually validated to ensure accurate MS/MS analysis.
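The charge-dependent Xcorr cutoffs above can be expressed as a small filter. The following Python sketch is illustrative only: the function names, the tuple layout of a peptide hit, and the decision to reject charge states above +3 (for which no cutoff is stated) are assumptions, not part of the TurboSequest or Bioworks software.

```python
# Minimum Xcorr per charge state, as stated in the text.
MULTI_PEPTIDE_CUTOFF = {1: 2.5, 2: 2.5, 3: 3.0}   # protein supported by >= 2 peptides
SINGLE_PEPTIDE_CUTOFF = {1: 2.9, 2: 2.9, 3: 3.2}  # protein supported by 1 peptide


def passes(xcorr, charge, cutoffs):
    """True if the hit meets the minimum Xcorr for its charge state.

    Charge states without a stated cutoff are rejected (an assumption).
    """
    cutoff = cutoffs.get(charge)
    return cutoff is not None and xcorr >= cutoff


def accept_protein(hits):
    """hits: list of (peptide_sequence, charge, xcorr) for one candidate protein."""
    # Distinct peptides that meet the less stringent, multi-peptide cutoffs.
    distinct = {seq for seq, charge, xcorr in hits
                if passes(xcorr, charge, MULTI_PEPTIDE_CUTOFF)}
    if len(distinct) >= 2:
        return True
    # Otherwise require one hit that meets the stricter single-peptide cutoffs.
    return any(passes(xcorr, charge, SINGLE_PEPTIDE_CUTOFF)
               for _seq, charge, xcorr in hits)


if __name__ == "__main__":
    print(accept_protein([("AAK", 2, 2.6), ("CCK", 3, 3.1)]))  # two peptides
    print(accept_protein([("AAK", 2, 2.6)]))                   # weak singleton
```

Manual validation of singleton spectra, as described in the text, is a separate step that this filter does not attempt to reproduce.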
Peak lists were generated using the TurboSequest algorithm. To obtain an identification error rate or false positive rate (FPR), we ran a reversed database using the same filter criteria as above. The FPR was calculated to be <1% using the following equation: FPR = number of false peptides/(number of true peptides + number of false peptides) × 100. In our total protein list, male and female gonadal protein hits have been combined, and all redundancies have been included.
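As a worked illustration of the FPR equation, the short Python sketch below evaluates it for a pair of counts. The counts in the example (620 forward hits, 5 reversed-database hits) are hypothetical and chosen only to show the arithmetic; the actual number of decoy hits is not reported in this section.

```python
def false_positive_rate(n_true, n_false):
    """FPR (%) = n_false / (n_true + n_false) * 100.

    n_true: peptides passing the filters in the forward database search.
    n_false: peptides passing the same filters in the reversed database search.
    """
    total = n_true + n_false
    if total == 0:
        raise ValueError("at least one peptide identification is required")
    return 100.0 * n_false / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(f"FPR = {false_positive_rate(620, 5):.2f}%")
```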
Bioinformatics-All identifiable proteins that conformed to the above filtering criteria were subjected to various bioinformatics analyses. Protein subcellular localization, functional type, molecular role, and biological pathway analyses were performed using Ingenuity® pathway analysis (IPA) software (Ingenuity Systems). Fisher's exact test was used to calculate a p value determining the probability that functions or interactions assigned to the protein data set are due to chance alone. Chromosomal mapping and gene tissue expression were determined by searching the Unigene mouse expressed sequence tag (EST) library (National Center for Biotechnology Information). Protein names, IPI numbers, gene symbols, and gene identifiers (Affymetrix microarray probe numbers) were aligned either manually or using the Martview function in the data mining browser BioMart (version 0.7, European Bioinformatics Institute and Ontario Institute for Cancer Research) to allow for comparisons between proteomics studies and between proteomics and transcriptomics studies, respectively. Information regarding protein/gene expression, mutant phenotypes, and association of proteins with known disorders of human sexual development was obtained using the literature search engine PubMed (National Center for Biotechnology Information) and the Mouse Genome Database available at Mouse Genome Informatics (MGI; September 2008-February 2009, The Jackson Laboratory).
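To make the enrichment statistic concrete, the following Python sketch computes a one-sided (right-tailed) Fisher's exact p value from the hypergeometric distribution using only the standard library. It is a sketch under stated assumptions: that the test is one-sided for over-representation, and that the four counts are defined as in the docstring. It is not the IPA implementation, and the numbers in the example are hypothetical.

```python
from math import comb


def fisher_right_tailed(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: proteins in the reference set
    K: reference proteins annotated with the function of interest
    n: proteins in the data set
    k: data set proteins annotated with the function
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("counts are inconsistent")
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., min(n, K) annotated proteins.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical example: 12 of 50 data set proteins carry an annotation
    # shared by 100 of 2000 reference proteins.
    print(fisher_right_tailed(12, 50, 100, 2000))
```

A small p value here indicates that the overlap between the data set and the annotated group is larger than expected by chance alone.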

Identification of Proteins Expressed in Embryonic Gonad at the Time of Sexual Differentiation
The starting materials for this study were genital ridges from mouse embryos at 11.5 and 12.0 dpc, spanning a critical window in gonad sexual development. Although over 1200 genital ridges were collected for this study, each genital ridge is ~1 mm long and ~0.1 mm thick and yields <1 µg of protein, limiting the quantity of starting material available for the proteomics analysis. We used SCX to prefractionate tryptic peptides derived from the protein extracts. Outputs of 220 2D LC-MS/MS runs, corresponding to the first 20 SCX fractions from each of the six male and five female gonad sample replicates, were searched against the mouse IPI database. We identified a total of 620 peptides (from 4494 peptide hits) from the combined lists for the male and female embryonic gonads: 296 (48%) were common to both sexes, 324 (52%) were unique to males, and none were unique to females (supplemental Data S1).
Within this data set, 257 peptides uniquely identified 154 "non-redundant" proteins. The remaining 363 peptides ambiguously identified a further 883 "redundant" proteins, giving a total of 1037 protein identifications (supplemental Data S1 and S2). Because we could not conclusively eliminate any of these redundancies (and considering that each listed peptide may have been detected more than once; supplemental Data S1) and in an effort to capture broad proteome information from this small and relatively inaccessible subregion of the mouse embryo, all 1037 non-redundant and redundant proteins were included in subsequent analyses. We refer to these 1037 proteins as the "total" protein data set.
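The distinction between non-redundant and redundant identifications can be sketched in a few lines of Python. The function name, the input layout (a mapping from each peptide to the set of database entries containing it), and the rule that a protein supported by at least one uniquely mapping peptide is counted only as non-redundant are assumptions for illustration; the identifiers in the example are invented.

```python
def classify_proteins(peptide_to_proteins):
    """Split protein identifications into non-redundant and redundant sets.

    peptide_to_proteins: dict mapping a peptide sequence to the set of
    database entries that contain it.
    """
    non_redundant, redundant = set(), set()
    for proteins in peptide_to_proteins.values():
        if len(proteins) == 1:
            non_redundant.update(proteins)  # peptide maps to a single entry
        else:
            redundant.update(proteins)      # peptide is shared by several entries
    # Assumption: an entry with a unique peptide is not also counted as redundant.
    redundant -= non_redundant
    return non_redundant, redundant


if __name__ == "__main__":
    # Invented peptide sequences and protein identifiers.
    mapping = {
        "PEPA": {"P1"},
        "PEPB": {"P1"},
        "PEPC": {"P2", "P3"},
        "PEPD": {"P1", "P4"},
    }
    unique, ambiguous = classify_proteins(mapping)
    print(sorted(unique), sorted(ambiguous))
```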
Each protein was either common to both sexes (611; 59%) or unique to males (426; 41%; supplemental Data S2). For subsequent bioinformatics analyses, the protein identifications from each sex were pooled into a single data set, considering that the female data set was a subset of the male.
In total 462 proteins (45%) were convincingly identified by two or more peptides (Fig. 1 and supplemental Data S2). For the remaining 575 proteins identified by single peptide hits (supplemental Data S2), correct identification was confirmed by manual analysis of the MS/MS CID spectra (supplemental Data S3). Within this subset of proteins identified by single peptide hits, 88 were non-redundant; the remaining 487 were redundant and were identified from 164 single peptide hits (supplemental Data S2). The identification error rate (or FPR) for the total data set was calculated to be less than 1% from a parallel analysis using the reverse IPI mouse database. Thus, the peptides that correspond to 1037 redundant and non-redundant potential gonadal proteins were identified to a high level of confidence.

Classification of Embryonic Gonadal Proteins
Bioinformatics analyses were performed on the total potential protein data set to determine subcellular localization, functional type, and molecular roles of the encoded proteins and to construct networks of biological processes.
Subcellular Localization-The intracellular location of identified potential proteins was predicted using Ingenuity software. Of the 1037 potential proteins, IPA was able to map and analyze 656 (63%; comprising 145 non-redundant and 511 redundant proteins). Of those, 185 were nuclear, 347 were cytoplasmic, 32 were plasma membrane-associated, seven were extracellular, and 85 were of unknown subcellular location (Fig. 2 and supplemental Data S4). Thus, the screen identified proteins from all cellular and extracellular compartments as might be expected of a whole tissue screen.
Functional Type-The protein data set was characterized according to general protein function using IPA software. The 656 gonadal proteins mapped by IPA consisted of 12 major types including 137 enzymes, 90 structural proteins, 85 transcriptional regulators, 36 translational regulators, 46 transporters, 25 chaperones, and 161 proteins of minor or unknown functional categories (Fig. 3). Functional assignment for individual proteins can be found in supplemental Data S5. The "unknown" group contains many novel and predicted proteins that may include previously unrecognized regulators of gonadal differentiation.
Molecular Roles-The molecular functions of the proteins were retrieved utilizing Ingenuity software. Of the 656 mapped proteins, 310 (47%; comprising 121 non-redundant and 189 redundant proteins) were eligible for IPA functional assignment (supplemental Data S6). Because multiple functions can be attributed to one protein, those with high statistical probability (p < 0.05) were selected. A total of 49 different functions were assigned; the most significant are shown in Fig. 4. Major categories included RNA post-transcriptional modification, post-translational modification, protein folding, and protein synthesis. More specific molecular functions and the names of proteins for each can be found in supplemental Data S6.
Molecular Network Construction-Biological processes were assigned and networks were constructed using IPA software. Of the 656 mapped proteins, at least 342 (52%; comprising 138 non-redundant and 204 redundant proteins) were known components of existing molecular networks (supplemental Data S7). The network with the highest significance score (of 57) outlines the coordinated activity of 35 genes of which 33 encoded proteins were identified in the embryonic gonadal proteome (Fig. 5). The functions associated with this network include protein synthesis, closely reflecting the major molecular functions outlined in Fig. 4. The network data emphasize the importance of producing many and varied gene products during the differentiation of a complex and dynamic tissue such as the gonad.

Chromosomal Mapping
The chromosomal position of the gene corresponding to each protein was ascertained by converting IPI numbers to gene symbols and then searching the Unigene mouse EST library. In this analysis, 867 proteins (84%; comprising 153 non-redundant and 714 redundant proteins) were encoded by genes that mapped to known chromosomal positions (Fig. 6 and supplemental Data S8). A large number of genes mapped to chromosomes 11 (113; 11%) and 2 (95; 9%). Fewer than 20 mapped to each of chromosomes 12, 14, 16, and 18. A further 20 proteins were transcribed from X-linked genes, and two were encoded by their Y-linked homologues (Table I). The distribution of the mapped genes (Fig. 6) accorded with the total number of protein-coding genes known to reside on each chromosome. Clusters of 10 or more protein-encoding loci were detected at cytobands 4D, 5G, 10C, 11A-B, 11E, 15F, and 17B, but the majority were evenly distributed across each chromosome.

Tissue Expression
The tissue expression profile of each identified protein was determined by converting protein IPI numbers to gene symbols and then interrogating the mouse EST library as above. Of the 1037 identified proteins, 696 (67%; comprising 147 non-redundant proteins, the remainder being redundant) were represented in the EST library. Of the 696 ESTs, adult testis expressed 605 (87%), suggesting that the testicular roles of the remaining 13% (81) are restricted to developmental processes. Of those, 74 were expressed in other adult tissues; only seven ESTs were found in both adult and embryonic testis but no other adult tissue (Table II) and are therefore likely to have testis-specific roles throughout the life of the organism. Adult ovary expressed 525 ESTs (75%), which may suggest that the ovarian roles of the remaining 25% (171) are restricted to developmental processes. However, all of those 171 were expressed in other tissues, and no embryonic gonadal peptides were identified in our screen that were female-specific (supplemental Data S1); therefore, ESTs with exclusive roles in ovarian development are not represented in our analysis. Finally, six ESTs (1%) were not detected in any adult tissues (Table III), implying that the corresponding proteins function specifically during embryonic gonad development.

FIG. 5. Biological network constructed from embryonic gonad proteins. Using IPA software to predict coordinated networks of gene products, 342 genes were found to be components of 24 significant biological networks. Depicted is the highest scoring network containing 33 genes identified in the gonad proteome (shown in gray).

Comparison with Previous Proteomics and Transcriptomics Screens
Literature searches and data mining tools were used to compare the total list of 1037 proteins with outcomes of published proteomics and transcriptomics studies.
Proteomics Screens-Two proteomics studies have previously been carried out using embryonic gonad material (34,35). The proteomics data sets were compared by converting protein names and IPI numbers to gene symbols. One screen (35) identified three proteins specifically up-regulated in the testis at 13.5 dpc in the mouse; all three proteins were also detected in the present screen (Fig. 8), indicating their continued expression throughout the early stages of gonadal development. Of the 44 proteins found to be expressed in cultured chicken PGCs by Han et al. (34), 38 had mouse orthologues; 25 (66%) of these were also identified in our current proteomics screen (Fig. 8 and supplemental Data S10), indicating either a generic or housekeeping cellular function or a conserved role in germ cell development.
Transcriptomics Screens-Several gene expression screens have been carried out using mouse embryonic gonads or flow-sorted gonadal cells (16-26). Of these, none used an experimental design directly comparable with the present study. However, Small et al. (26) conducted a genome-wide Affymetrix microarray expression analysis of mouse whole genital ridges/gonads at a number of developmental stages. We restrict our comparison here to the outcomes of that study at 11.5 dpc. The transcriptomics and proteomics data sets were compared by converting protein IPI numbers to gene symbols and microarray probe numbers using the BioMart browser. In this way, 707 proteins from our current screen were found to have corresponding gene symbols and Affymetrix probe numbers, which are necessary for cross-comparison of the two types of expression screens. Of those, 655 (93%) were present in both screens (Fig. 9 and supplemental Data S10). The remaining 52 proteins were unique to the proteomics data set (Table IV and supplemental Data S10).
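Once both data sets are expressed as gene symbols, the cross-comparison reduces to set operations. The Python sketch below shows the idea; the function name and the gene symbols in the example are placeholders, and the conversion from IPI numbers and probe numbers to gene symbols (done with BioMart in this study) is assumed to have been carried out beforehand.

```python
def compare_screens(proteomics_genes, transcriptomics_genes):
    """Return genes shared by both screens, genes seen only in the
    proteomics screen, and the shared percentage of the proteomics set."""
    proteomics_genes = set(proteomics_genes)
    transcriptomics_genes = set(transcriptomics_genes)
    shared = proteomics_genes & transcriptomics_genes
    proteomics_only = proteomics_genes - transcriptomics_genes
    percent_shared = (100.0 * len(shared) / len(proteomics_genes)
                      if proteomics_genes else 0.0)
    return shared, proteomics_only, percent_shared


if __name__ == "__main__":
    # Placeholder gene symbols, for illustration only.
    shared, only, pct = compare_screens({"GeneA", "GeneB", "GeneC", "GeneD"},
                                        {"GeneA", "GeneB", "GeneC", "GeneX"})
    print(sorted(shared), sorted(only), pct)
    # The reported counts give the same style of percentage: 655 of 707.
    print(round(100 * 655 / 707))
```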

Association with Human Disorders of Sex Development
The outputs of this screen were examined for potential links to human disorders of sexual development (DSDs), the causes of which are mostly unknown. We identified 68 proteins transcribed from genes with human orthologues mapping to loci associated with 34 separate reports of DSDs (Table V). Of these genes, 26 have been functionally analyzed in mouse models, and targeted null and/or heterozygous mutations of four of those have been shown to affect gonad morphology and/or fertility. For a further 12 genes, homozygous mutation in mice caused embryonic or neonatal lethality, making sexual development and reproductive capacity difficult to ascertain. Thus, we identified 54 genes with potential links to human DSDs whose functions warrant investigation in animal models (Table V).

DISCUSSION
In this study we mapped the embryonic gonad proteome at the critical period of sexual differentiation in mice. Using 2D LC-MS/MS technology, we generated a novel data set of 1037 non-redundant and redundant proteins. This data set constitutes the most extensive record available of proteins expressed in mouse embryonic gonads and confirms and extends similar studies conducted at the transcriptomic level. The resulting data set provides valuable insights into the biology and molecular regulation of sex determination, gonadal organogenesis, and germ cell differentiation, identifying new candidates that may regulate normal developmental processes as well as others that may underlie disorders of human sexual development.

Technical Considerations
The mouse embryonic genital ridges used in this analysis are laborious to resect from the embryo and difficult to separate from the adjacent mesonephros. Although more than 1200 genital ridges were collected, the starting amount of protein was low. This issue, in conjunction with the stringency of the screen (only the three most intense peptide peaks from each MS scan were selected for in-depth MS/MS analyses), implies that many peptides and hence proteins were not detected. Accordingly, proteins commonly associated with gonadogenesis (dosage-sensitive sex reversal, adrenal hypoplasia critical region, on Chr X, gene 1 (DAX1), steroidogenic factor 1 (SF1), and Wilms tumor suppressor 1 (WT1)), sexual differentiation (fibroblast growth factor 9 (FGF9), SOX9, and SRY), and PGC pluripotency (mouse Vasa homologue (MVH) and POU domain, class 5, transcription factor 1 (OCT4)) (for a review, see Ref. 2) were not detected, although these are often missed in transcriptomics studies.
On the other hand, the separation and detection capabilities of the 2D LC-MS/MS technologies used provided greater sensitivity than other proteomics screening techniques (27,28,36,37). As a result, the data set evidently included many low abundance proteins because identification by a single peptide typically reflects low abundance (38). All single peptide hits were manually validated to ensure correct protein identification. Moreover, the rate of protein identification error was less than 1%, indicating the high level of accuracy of the protein identifications that were obtained. Given the difficulties involved in generating a truly comprehensive proteome data set from such inaccessible starting material, a limited but highly accurate data set is arguably the best outcome achievable using available proteomics technology.
Although redundant proteins, i.e. groups of two or more proteins containing one of the peptides identified, were included in the outcomes of this study, it is expected that only a subset of these is genuinely expressed in the tissue analyzed. An important aim for future studies will be to establish, either through larger studies of this type using more advanced technologies that may emerge in the future or by directed study of individual candidates, which of the 883 redundant proteins identified in this study are genuinely expressed in developing mouse gonads. Although it is undoubtedly useful to identify candidates from the list of redundant proteins presented here, the lack of certainty regarding their expression should be borne in mind in the following discussion.

Comparison with Previous Screens
Our total data set of non-redundant and redundant proteins showed considerable overlap with outputs of published screens, increasing confidence that the overlapping genes and proteins reflect genuine players in gonadal development. In addition, our data set also detected proteins not previously reported at the transcript or protein level, thereby providing new candidates that potentially regulate sex determination, gonadogenesis, and germ cell differentiation.
Proteomics Screens-Despite the technical difficulties associated with collecting small embryonic tissues, two proteomics studies have been attempted previously using embryonic gonad material. The first, a small scale proteomics analysis of whole embryonic gonads, was performed in our laboratory (35) and compared the differential expression of proteins between male and female mouse gonads at 13.5 dpc. Utilizing 2D electrophoresis (2DE) coupled with MALDI-TOF MS, approximately 600 spots were resolved on the gels, and three proteins up-regulated in male gonads were identified. These were heterogeneous ribonucleoprotein A1 (hnRPA1), heat shock protein (HSP) 90 family member B1 (HSP90B1, GRP94, or TRA1), and HSP70 family member heat shock cognate 71 (HSC71 or HSPA8). Using 2DE, HSC71 was found to be phosphorylated specifically in male gonads, suggesting that post-translational modification of proteins is important for gonad sexual differentiation. Our present data confirm that all three proteins (as well as isoforms of 15 other hnRNPs and isoforms of 15 other HSPs) are expressed at the time of sex determination and may therefore be important for that process.
In a second study, Han et al. (34) generated proteome data for chicken gonadal PGCs isolated at stage 28 and cultured for 7-10 days before being subjected to 2DE coupled with MALDI-TOF MS and LC-MS/MS. Approximately 300 spots were observed on the gels, representing 44 proteins, 25 of which were also detected in our current proteomics screen. This overlap is not surprising given that the common proteins (including α- and β-actin (ACTA/B), vimentin (VIM), tropomyosins 1 and 2 (TPM1/2), desmin (DES), albumin (ALB1), α-enolase (ENO1), HSPA2, HSPA5, and HSPA8) are structural or housekeeping components of most eukaryotic cells. The absence of the remaining 19 proteins from our list may reflect the different tissues being studied (chicken cultured PGCs versus mouse whole gonads). Six of these had no mouse orthologues. Despite this overlap and even when using a conservative comparison of 154 non-redundant proteins, the present data set is several times the size of the largest previous proteomics screen and identified more than 1000 non-redundant and redundant gene products that have not been previously reported in embryonic gonads at the protein level.
Transcriptomics Screens-Most previous studies have used mRNA expression analysis to elucidate molecular events in sex determination (16, 19-25, 39). RNA can be readily amplified, eliminating the need for large quantities of tissue. Furthermore, current screening techniques such as microarrays are more sensitive than proteomics methodologies (40,41). However, transcriptomics and proteomics analyses are now considered complementary to provide a thorough molecular description of the cell or organ under study (for reviews, see Refs. 40-43).
To illustrate this point, we compared our proteomics data set with an equivalent transcriptomics analysis at 11.5 dpc in which Affymetrix microarrays were used to determine gene expression in male and female whole gonads (26). There was considerable correlation between identifications from the different screening techniques; 655 proteins with available probe numbers (93%) were also present at the transcript level. Our screen confirms the presence of the functionally active gene products for those genes.
The transcriptomics screen generated over 8000 outputs, and not surprisingly, 7604 were not detected in our proteomics screen (Fig. 9). However, we detected 330 non-redundant and redundant proteins that were not represented on the microarray and 52 more that were not detected at the transcript level. Identification of the former proteins highlights a limitation of microarray technology. The latter proteins may be stable products of short lived mRNAs or abundant products of rare mRNAs and represent useful leads for further investigation.
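The counts quoted in the two preceding paragraphs can be checked against one another with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes that the 93% figure is calculated relative to proteins that have a probe on the microarray (transcript detected plus transcript not detected), which is how the text words it.

```python
# Counts reported in the text for the total data set
# (non-redundant and redundant proteins combined).
DETECTED_AT_TRANSCRIPT = 655      # probe available, transcript detected
NOT_DETECTED_AT_TRANSCRIPT = 52   # probe available, transcript not detected
NOT_ON_MICROARRAY = 330           # no probe represented on the array


def summarize_overlap(detected, undetected, no_probe):
    """Derive totals and the percentage of probed proteins seen as transcripts.

    Assumption: the percentage is taken over proteins with a probe,
    not over the whole proteomics data set.
    """
    with_probe = detected + undetected
    total = with_probe + no_probe
    percent_confirmed = round(100 * detected / with_probe)
    return {
        "with_probe": with_probe,
        "total": total,
        "percent_confirmed": percent_confirmed,
    }


summary = summarize_overlap(
    DETECTED_AT_TRANSCRIPT, NOT_DETECTED_AT_TRANSCRIPT, NOT_ON_MICROARRAY
)
print(summary)
```

Under that assumption the three categories sum to the 1037 gene products of the total data set, and 655 of the 707 proteins with probes corresponds to the reported 93%.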

Identification of Potential Regulators of Normal Gonad Development
Classifying our total data set by a variety of characteristics (sex of gonad in which the peptide was identified, tissue EST expression, molecular function, subcellular localization, and chromosomal distribution) identified a number of subgroups of proteins (sex-specific, testis-associated, embryo-specific, intracellular, extracellular, and X/Y-encoded) containing novel candidates that potentially regulate sex determination, gonadogenesis, and PGC differentiation. The protein numbers and examples mentioned below are taken from the total data set and are composed of both non-redundant and redundant proteins.
Proteins Specific to Each Sex-Male and female gonadal samples were kept separate throughout the experimental procedure to establish a baseline of protein expression for each sex. All peptides (and therefore proteins) were detected in male gonads, but only 48% of peptides (59% of proteins) were detected in the female gonads. The lack of unique female identifications and the unequal distribution between the sexes were not unexpected and likely reflect the relative quiescence of the ovarian development pathway compared with the morphologically active testis pathway (for reviews, see Refs. 3, 44, and 45). Proteins common to both sexes (59%) are predicted to perform a generic cellular or developmental function within the gonads. Proteins uniquely identified in the testis (41%) include redundant proteins such as two Y chromosome proteins, DEAD box polypeptide 3 Y-linked (DDX3Y or DBY) and ubiquitin-activating enzyme E1 Chr Y (UBE1Y), as well as spermatogenic and testis-specific forms of glyceraldehyde-3-phosphate dehydrogenase and a non-redundant protein, nuclear autoantigenic sperm protein. It is tempting to speculate that these and other proteins from the testis-specific subset could be involved in regulating embryonic testis differentiation and development. Further analysis is required to determine the role of these new candidates.
Testis-associated Proteins-A total of seven proteins were found in both adult and embryonic testis but in no other adult tissue (Table II), and therefore these proteins are likely to have testis-specific roles throughout the life of the organism. These include the redundant protein DNA segment, Chr 1, Pasteur Institute 1 (D1Pas1, Pl10), an autosomal homologue of the sex-linked genes Ddx3x and Dby. In the mouse, D1Pas1 is expressed only in adult testicular germ cells (predominantly in the nuclei of meiotic pachytene spermatocytes and postmeiotic round spermatids) where it has a role in pre-mRNA processing during spermatogenesis (46-48). No human orthologue has been reported, and this is the first report of possible D1Pas1 protein expression in mouse embryonic gonads. D1Pas1 mRNA was detected in mouse genital ridges at 11.5 dpc (not 12.5 dpc) by microarray gene expression analysis (26). Its function at this early developmental stage remains to be determined.
Proteins Specific to Embryonic Gonads-Protein expression that is restricted to genital ridges (and not adult gonads) implies a specific function during gonadogenesis. From our screen, 13 and 25% of the total identified potential proteins were expressed only in the embryonic testis and ovary, respectively, and were not expressed in adult gonads (although they were found in other adult tissues). For example, a redundant protein we identified is the conserved nuclear phosphoprotein transcription factor SET translocation (SET), which reportedly exhibits higher expression in the embryonic genital ridges than in adult gonads (49). SET is pertinent to male gonad development and reproductive function in that it activates cytochrome P450c17 gene expression, which is necessary for synthesis of sex steroids by Leydig cells (49,50). SET also functions as a transcriptional activator in oocyte development, but its gene target(s) there is unknown (49). Other proteins identified in this screen as having temporally restricted expression only in the embryonic gonads, including non-redundant proteins such as chemokine (CXC) ligand 15 (CXCL15) and hypoxia up-regulated 1 (HYOU1) protein as well as two redundant histone H2A variants (HIST1H2AH and HIST1H2AK), warrant further investigation of their potential roles in gonad and PGC development and sexual differentiation.
Intracellular, Putative Regulatory Proteins-We identified more than 200 non-redundant and redundant proteins that localize to the nucleus or cytoplasm and have potential roles in regulating gene, RNA, and/or protein activity. These include proteins involved in regulating DNA replication, recombination, and repair (36); gene expression (21); RNA modification (31) and trafficking (10); protein synthesis (37); folding (12) and modification (23); and molecular transport (38).
Extracellular Proteins-The plasma membrane and extracellular proteins identified in this screen are particularly interesting as they potentially regulate signaling between somatic lineages that is known to be important for gonadal differentiation. Communication between somatic and germ cells is necessary for regulation of sex-specific germ cell differentiation (4, 5) and appropriately timed sex cord formation in the testis (65, 66). Relatively few membrane-bound (32) and extracellular proteins (7) were identified, but these included the redundant protein spectrin α2 (SPNA2; Refs. 67 and 68) and non-redundant proteins such as vinculin (VCL; Refs. 69-71) and talin 1 (TLN1; Refs. 70 and 72), which all have established roles in gonad and germ cell functions. Other extracellular proteins, including non-redundant proteins such as small inducible cytokine subfamily E member 1 (SCYE1, EMAP2, or p43) and the ELR+ chemokine CXCL15, are chemotactic cytokines (73) that have not previously been linked with functions in the embryonic gonads and so represent new candidates potentially regulating embryonic gonad development. CXCL15 is not expressed in adult gonads (74), suggesting that any role it might have in gonads is embryo-specific. Many chemokines in the CXC family reportedly promote angiogenesis during development and tumor growth (for a review, see Ref. 73), raising the possibility that CXCL15 may function in establishing the extensive vascular network associated with testis and (to a lesser extent) ovary development. Further investigation is required to validate this proposed function.
Proteins Encoded by Genes on Sex Chromosomes-Proteins encoded by genes located on the sex chromosomes play pivotal roles in instigating sex determination events (8, 9). Of particular interest in this screen are protein-coding genes unique to the Y chromosome as these genes are male-specific and potentially direct the differentiation and functioning of the testis. We identified two proteins encoded by Y-linked genes, UBE1Y and DDX3Y. These proteins were redundant for their ubiquitously expressed X-encoded homologues (UBE1X and DDX3X, respectively). Expression of Ube1y is testis-specific and germ cell-dependent (75, 76) in both the embryonic and adult mouse (77). Similarly, the RNA helicase Ddx3y is expressed in Sertoli and germ cells of the testis in both the embryonic and adult mouse (48). Ddx3y was also found to be up-regulated from 10.5 dpc in testis development in a microarray screen comparing male and female somatic cells (25). In the mouse, these two genes were considered candidates for the Y-encoded spermatogonial proliferation factor Spy, but Spy was later shown to be the initiation factor Eif2s3y (78). However, reduction or deletion of the human transcript DDX3Y is commonly associated with spermatogenic damage, azoospermia, and idiopathic male infertility, suggesting that DDX3Y plays an important role in human spermatogenesis (79). The function of these two proteins in mouse embryonic gonads, especially in male germ cell development, remains to be investigated.

Potential Candidates for Human DSDs
DSDs affect approximately 1 in 4500 live births and have significant biological, reproductive, psychological, and social ramifications (for a review, see Ref. 80). Although progress in understanding the causes of DSDs has been made with the discovery of key genes involved in sex determination and sexual development through mutational analyses in humans and functional analyses in mice, most DSDs remain unexplained at the genetic level. In this study we identified 68 proteins expressed in the developing gonads that are transcribed from genes with human orthologues that map to loci associated with DSDs (Table V). These proteins represent candidates that potentially regulate gonadal development and sex determination.
Two examples are the redundant protein serpin 1 mRNA-binding protein (Serbp1 or Pairbp1) and the non-redundant protein KH domain-containing, RNA-binding, signal transduction-associated 1 (Khdrbs1 or Sam68) protein. The human orthologue of Serbp1 maps to chromosome 1p31, whereas Khdrbs1 has a human orthologue mapping to 1p32. The region containing these genes is duplicated in a variety of human DSDs including male-to-female sex reversal with hypergonadotrophic hypogonadism (81), male pseudohermaphroditism (82), and cryptorchidism (83). SERBP1 has been shown to interact with the progesterone receptor membrane component 1 (PGRMC1), forming a plasma membrane complex that modulates the antiapoptotic and antimitotic actions of progesterone in ovarian cells (84, 85). The potential function of this protein in the process of sex determination and gonad development warrants further investigation. The present screen constitutes the first report of KHDRBS1 expression during gonadogenesis. Expression of KHDRBS1 protein has been reported in germ cells of the adult mouse testis (86), and Khdrbs1 homozygous null mutant male mice are infertile (87). RNA-binding proteins of the KH type are known regulators of cellular differentiation, so determining the possible role of KHDRBS1 in male gonad and/or germ cell differentiation and its potential relationship to human DSD is likely to be a useful avenue of investigation.
A third protein potentially associated with human DSDs is the non-redundant, ubiquitously expressed transketolase (TKT or p68), a key component of the pentose phosphate pathway. The human orthologue of Tkt maps to a locus (3p14.3) that is duplicated in cases of gonadal dysgenesis (88) and sex reversal with dysplastic gonads (89). Homozygous null mutant (Tkt−/−) mice die at or before birth; Tkt+/− male mice are viable, having reduced gonad mass, whereas Tkt+/− females exhibit reduced fertility levels (90). The reduction in gonad size suggests a role for Tkt in gonadogenesis and makes it an excellent candidate for further research into its potential involvement in DSDs.
Two other proteins with possible links to human DSDs, Ewing sarcoma breakpoint region 1 (EWSR1) and filamin-α (FLNA), both of which are redundant, lead to reduced fertility and/or abnormal gonad morphology when mutated in mice (91, 92). Mutation of genes encoding a further 12 proteins associated with human DSDs causes lethality in mice, and so it is not currently possible to evaluate their potential role in gonadogenesis or sexual development. These proteins, as well as those that do not already have mutant models, warrant further analysis, for example, by using conditional knock-out strategies to determine their potential involvement in human DSDs.

Concluding Remarks
We present here the first large scale proteomics survey of the embryonic gonads at the onset of sexual differentiation, providing a valuable resource for understanding the molecular landscape of gonadal development at the protein level. Given the popularity of comparative analyses in the study of embryonic gonad development, an important goal for future studies will be to compare protein expression between the sexes or between time points as sexual differentiation takes place to identify novel proteins involved in sex- or stage-specific events in gonadal differentiation.