Human Liver Proteome Project

The Human Liver Proteome Project is the first initiative of the human proteome project for human organs/tissues and aims at writing a modern Prometheus myth. Its global scientific objectives are to reveal the “solar system” of the human liver proteome, expression profiles, modification profiles, a protein linkage (protein-protein interaction) map, and a proteome localization map, and to define an ORFeome, physiome, and pathome. Since it was first proposed in April 2002, the Human Liver Proteome Project has attracted more than 100 laboratories from all over the world. In the ensuing 3 years, we set up a management infrastructure, identified reference laboratories, confirmed standard operating procedures, initiated international research collaborations, and finally achieved the first set of expression profile data.

The Human Liver Proteome Project (HLPP) is one of the initiatives launched by the Human Proteome Organization (HUPO) (1-6). As the first initiative on human tissues/organs, HLPP aims to 1) generate an integrative approach leading to a comprehensive protein atlas of the liver, 2) expand the liver proteome to its physiome and pathome to dramatically accelerate the development of diagnostics and therapeutics toward liver diseases, and 3) develop standard operating procedures (SOPs) for other HUPO initiatives.

BACKGROUND
Any discussion of the liver calls to mind the ancient Greek myth of Prometheus. As it is told, Prometheus stole the sacred fire from Zeus and the gods and gave it to the mortals. In punishment, Zeus commanded that Prometheus be chained for eternity in the Caucasus. There an eagle (or, according to other sources, a vulture) would eat his liver, and each day the liver would be renewed. So the punishment was endless, until Heracles finally killed the bird.
The liver is the largest organ in the body, is probably second only to the brain in complexity, and has the main digestive function for the metabolism of most substances. It has a myriad of additional functions beyond digestion, such as the production of red blood cells during embryonic development, the production of various plasma proteins, and the detoxification of xenobiotics; it is also the most effective site for phagocytosis of solid material and the guardian interposed between the digestive tract and the rest of the body. Furthermore, the liver is the source of a major portion of the fatty acids sent to other organs to be used as the primary source of energy in the fasted state and is the major site for synthesis of fatty acids from excess sugar in the fed state. The liver has a central role in activation, catabolism, and excretion of retinols, which play various critical roles in vision, growth, reproduction, cell proliferation, differentiation, and the integrity of the immune system. In addition, the liver plays a major role in determining the pharmacokinetics of a drug because it is the major organ of drug elimination through its metabolic capacity and biliary excretion; it also influences the distribution of drugs via the synthesis of their binding proteins. Liver diseases, such as viral hepatitis, liver cancer, and alcohol-related and drug-related liver injury, are great challenges for modern medicine. Worldwide, there are over 350 million hepatitis virus carriers. Over a million deaths per year (about 10% of all deaths in the adult age range) can be attributed to viral hepatitis, and over a million more per year are due to hepatocellular carcinoma.
The transcriptome is an essential basis for proteomic analysis of a given tissue or cell line. Because of its biological, physiological, and pathological significance, the serial transcriptomes of human liver samples, such as fetal liver (7), adult liver (8), and liver cancer (9), have been well established. Those human liver transcriptomes provide considerable resources not only for the integration (data-based) of transcriptome data with corresponding proteome data but also for scaling protein identification (data-based) and protein linkage map (cDNA-based) during liver proteome analysis.

FORMATION OF HLPP
The idea of the HLPP was first proposed at the HUPO Workshop (Bethesda, MD, April 28-29, 2002), where there was broad interest and support for starting an international cooperative effort to integrate academic, industrial, and government activities into a comprehensive study of liver proteins. The first HUPO Liver Proteome Workshop was therefore held in Beijing (October 22-24, 2002) with the participation of more than 100 scientists from 13 countries (Fig. 1).
Sample Collection and Banking (Chaired by Dr. Christian Bréchot, France)-Sample banking is the primary role of HLPP. Only with well defined, standardized samples will it be possible to comprehensively understand the liver proteome and make comparisons between normal and diseased livers; among three races, such as Caucasian, Mongolian, and Negro; and among different species. Therefore, the subproject led by the Sample Banking Subcommittee should not only collect, store, and distribute human liver samples among participant laboratories but also establish SOPs for donor screening and sample collection, organize a global sample collection network, and work with the Bioinformatics group to maintain the necessary database for banked samples.
The specific objectives are as follows:
• Generation of an international liver tissue network: to build up a worldwide liver tissue collection network and to collect samples effectively.
• Collection and distribution of normal liver samples: to screen and collect normal liver tissue as a standard sample and to distribute to HLPP participating laboratories.
• Validation of new discoveries: to validate the novel issues identified from human liver proteome experiments by other subprojects such as new proteins, interactions, modifications, and localizations in physiological and pathological contexts.
• Large scale characterization of antibodies: to build up an immunological and histological platform for automatically screening the large amount of liver-specific antibodies generated by the Antibody Banking subproject.
Expression Profile (Chaired by Dr. Fuchu He, China)-The expression profile of a proteome, composed of all expressed proteins in an organ, tissue, or cell in a given time or at a development stage, is the most basic characteristic of a proteome; other aspects of the proteome (interaction, modification, subcellular location, etc.) are extended alterations or combinations of the expression profile. Therefore, the construction of an expression profile is the first step of a given proteome project.
The objective of the construction of an expression profile is to separate, detect, and identify as many proteins as possible. However, because the dynamic range of protein expression can exceed 10^10, present methods are challenged in the detection and characterization of low abundance proteins. Therefore, the development of new technologies will play a key role in relieving this bottleneck and will become a frontier of proteome research. In addition to the traditional two-dimensional gel electrophoresis/MS approach, multidimensional LC-MS/MS, subcellular fractionation, protein prefractionation, and the depletion of high abundance proteins have been used to increase the probability of characterizing low abundance proteins and eventually obtain the most complete expression profile of a proteome.
The specific objectives are as follows:
• Development of new methods and technologies especially for high sensitivity identification of low abundance proteins.
• Construction of the overall proteome expression profile of human liver.
• Identification of about 10,000 protein entries from various hepatic cell types, subcellular components, and protein fractions.
• Integration of the proteome expression profile of human liver with the human liver transcriptome and the human plasma proteome.

Protein Linkage (Protein-Protein Interaction) Mapping and ORFeome (Chaired by Dr. Pengyuan Yang, China)-Construction of the human liver ORFeome is an essential step in liver research to go from the genome toward the proteome as well as the bridge that links the liver proteome with its transcriptome. Liver ORFeome could be used by the Localization Mapping, the Modification Profiling, and especially the Protein Linkage Mapping subprojects. The aims of the Liver ORFeome subproject are 1) to construct a transcriptome of human normal liver, 2) to set up a bank of human liver full-length cDNAs with 10,000 clones of complete ORFs, 3) to express the protein products in large scale, and 4) to clarify the rules by which human liver genes are transcribed and translated.
The overall goal of the Protein Linkage Mapping subproject is to understand the whole protein-protein interaction network in liver. This knowledge involves identification of all proteins in a cellular process, cascade, or pathway and in a protein complex, assembly, or protein machine; elucidation of biochemical, physiological, and pathological networks and their cross-talks; introduction of the original data into a computed theoretical model; and further characterization of interesting interactions. These data will provide insight into the function of important proteins, reveal novel functions of known proteins, find new signal transduction pathways, elucidate relevant networks, and facilitate the identification of potential drug targets in developing novel therapeutic agents.
The specific objectives are as follows:
1) Generation of human liver ORFeome and proteins.
• Identification and banking of human liver ORFeome.
• Expression and banking of human liver proteins from host cells such as bacteria or yeast/insect or mammalian cells.
• Establishment of human liver protein chips.
2) Generation of human liver protein linkage map.
• High throughput analyses of liver protein-protein interactions.
• Global separation, purification, and identification of protein complexes in liver cells.
• Large scale validation of liver protein-protein interactions with various methods such as protein microarray, GST pull-down, tandem affinity purification, etc.
Protein Localization Mapping (Chaired by Dr. Alexander Archakov, Russia)-Construction of a proteome localization map is essential to the elucidation of protein function in eukaryotic cells and is an important component of the proteome project. In the postgenome era, localization experiments have so far only pertained to an organelle of the single cell eukaryotic model, such as mitochondria, interior membranes, and nucleus in yeast, due to technical limits. The systematic localization of protein profiles in the cells of complicated organs is vital to the understanding of protein functions in physiological and pathological conditions and is important to finish.
The premise of studying the functions of hepatic proteins is to localize them among or within cells accurately, including localization at different levels of cells, subcellular structures, and cell ultrastructures. This subproject aims to set up an integrated system of protein localization and to apply the system to the large scale and accurate research of protein localization.
The specific objectives are as follows:
• Establishment of "top-down" protein localization technique platform that is from subcellular structures to proteins.
• Establishment of "bottom-up" protein localization technique platform that is from proteins to subcellular structures.
• Set-up of localization system of multiprotein co-localization at the subcellular level and technique system of protein interaction detection.
• Establishment of the localization plot of proteins in different cells of liver and different subcellular structures and even ultrastructures.

Post-translational Modification Profile (Chaired by Dr. Young-Ki Paik, Korea)-Post-translational modification (PTM) of proteins plays a very important role in biological processes such as cellular recognition, signaling, differentiation, and growth. The elucidation of PTMs is the most important justification for proteomics as a scientific endeavor. The number of documented protein co- and post-translational modifications has now exceeded 400. The most common modifications include phosphorylation, glycosylation, methylation, acetylation, nitration, sulfation, lipidation, ubiquitination, and proteolytic modifications. These chemical modifications of proteins are crucial to modulate their functions but are not directly coded for by genes.
Because the development of proteomic methods for determining protein modifications began only in recent years, the approaches currently available for modification studies are less than satisfactory. Although some research has involved nitrosylation, methylation, and oxidative and proteolytic modifications, most PTM studies in proteomics are currently focused on protein phosphorylation and glycosylation. Thus, the development of methodologies for modification studies should be strengthened at the initial stage of the project, and these studies should concentrate on the elucidation of the phosphorylation and glycosylation protein profiles in human normal liver.
The specific objectives are as follows:
• Separation and detection of modified proteins on a large scale.

Antibody Banking (Chaired by Dr. Qihong Sun, China)-Antibodies are unique in their high affinity and specificity and are absolutely critical in proteomics and biotechnology as well as in general molecular cell biology. When used in combination with techniques such as epitope mapping and molecular modeling, monoclonal antibodies (mAbs) enable the antigenic profiling and visualization of macromolecular surfaces. This allows various biomedical applications of mAbs in diagnostics, drug screening and testing, disease monitoring, drug discovery, and biomedical research. More importantly, the use of recombinant mAbs as targeted immunotherapies for cancer and other diseases has begun to show concrete success, and antibody engineering will provide for the development of molecular medicines of the future.
Recent studies using antibodies or antibody mimics and antibody arrays, which are expected to be the tool to bridge genomics and proteomics, have become an important part of the development of proteomics. Unlike PCR and DNA recombinant technologies for DNA/RNA, there is no strong strategy to amplify or enrich low abundance proteins. As a possible solution, antibody banking is expected to be a basic technical support for HLPP. With the development of large amounts of polyclonal and monoclonal antibodies against liver proteins, the antibody bank will provide not only important separation, identification, and detection service for other subprojects like expression profile, modification profile, interaction map, localization map, and liver diseases but also diagnosis kits and remedial antibodies for clinical use.
However, only a fraction of antibodies appears to behave satisfactorily in the chip format for global proteome analysis. This places tremendous emphasis on the new approaches for the faster generation of a large bank of well characterized antibodies and on improved methods of selection and evaluation of novel antibody applications. It has been noted that many individual mAbs and small collections of mAbs exist in the academic and commercial communities. They are often not properly characterized, and more than half of them may not be of sufficiently high quality. To date, there are no large, publicly accessible collections of mAbs in standard format for the research field of proteomics.
The specific objectives are as follows:
• Establishment and evaluation of large scale, high throughput antibody preparation platforms.
• Production of mAbs specific to 10,000 different kinds of proteins expressed in human liver.
• Development of polyclonal and monoclonal antibodies, antigen/antibody microarrays, and reagent kits.
• Support for the requirements of the HLPP and other HUPO projects.

Bioinformatics (Chaired by Dr. Rolf Apweiler, United Kingdom)-Large scale scientific research like proteomics produces considerable data and therefore requires powerful informatics support. The general targets of this subproject are a comprehensive bioinformatics platform including data standardization, collection, quality control, storage, analysis, integration, and dissemination; an integrated and comprehensive database system of human liver proteome; secondary databases of liver such as metabolism networks, signal transduction networks, and liver gene expression regulation networks; development of informatics tools like package software; and an annotation system of protein function.
The specific objectives are as follows:
• Data standards for protein expression profiles, post-translational modification profiles, protein-protein linkage maps, and cellular and subcellular localization maps.

Proteomic Analysis of Liver Disease (Chaired by Dr. Laura Beretta, United States)-Liver diseases are the ultimate targets and are an important part of HLPP. The aims of this subproject fall into two categories: liver physiome and pathome. "Liver Physiome" will include metabolome, toxicome, pharmacome, and regenerome, which correspond to the proteomic basis of the main hepatic physiological functions such as metabolism, detoxication, pharmacokinetics, and regeneration, respectively. Similarly the "Liver Pathome" could be defined as the integration of "Hepatitisome," "Cirrhosome," "Fibrosome," and "Hepatomaome," which represent the proteomic alterations associated with the major pathological disorders of liver.

PROGRESS IN THE PILOT PHASE
According to the global plan, HLPP could be divided into two phases: 1) pilot phase (2002-2005) and 2) action phase (2006-2010). Generally speaking, the pilot phase aims 1) to organize global work teams and set up a management structure and global communications, 2) to form SOPs for various targets, 3) to generate a specimen bank, 4) to construct and evaluate technology platforms, 5) to develop HLPP informatics infrastructure, 6) to complete the primary expression profile of human normal liver proteins, 7) to establish ORFeome of human liver, and 8) to generate 1000 antibodies against liver proteins.

Expression Profiling of "Normal" Human Liver and Related Subprojects

Standard Operating Procedures of Sample Collection
Because protein expression is remarkably variable under different conditions, it is essential to standardize liver tissue sampling.
Consideration of Sample Types: Liver Tissue Versus Cell Line-For the pilot phase study, liver tissue rather than a cell line has been selected as the standardized sample.
Consideration of Donors-For better comparison between healthy and pathologic liver tissue, the age, gender, and race of healthy liver sample collections should be tracked, although the first batch of samples for expression profiling from France was not separated by gender because the availability of liver samples is limited. Moreover, the health condition of donors in particular should be well defined: biochemical, immunological, virological, and Doppler examinations before the operation and histological examination of the resected sample are required.
Ethical Consideration-Informed consent both in agreement with national law and as approved by a germane ethics committee or human study committee is required.
Preparations before Surgery-Donors abstain from high lipid food and alcohol for at least 8 h before donation to reduce possible effects. The following issues are recorded before the surgery:
• Informed written consent from the donor or family.
• Clinical data including age, gender, weight/height, and medications.
• The surgical procedure including the total clamping time and the type of clamping (continuous or discontinuous).
• The sampling procedure including the time elapsed between resection and freezing or processing for cell isolation.
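The items recorded for each donor can be pictured as a single structured record. The following is only a minimal sketch of such a record, not the project's actual data model: the class name, field names, and units are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LiverSampleRecord:
    """Hypothetical per-donor record; all field names are illustrative."""
    consent_obtained: bool                       # informed written consent from donor or family
    age_years: Optional[int] = None
    gender: Optional[str] = None
    weight_kg: Optional[float] = None
    height_cm: Optional[float] = None
    medications: List[str] = field(default_factory=list)  # empty list = none reported
    clamping_minutes: Optional[float] = None     # total clamping time
    clamping_type: Optional[str] = None          # "continuous" or "discontinuous"
    minutes_resection_to_freezing: Optional[float] = None

    def missing_items(self) -> List[str]:
        """Return the names of listed items that have not been recorded."""
        missing = []
        if not self.consent_obtained:
            missing.append("consent_obtained")
        for name in ("age_years", "gender", "weight_kg", "height_cm",
                     "clamping_minutes", "clamping_type",
                     "minutes_resection_to_freezing"):
            if getattr(self, name) is None:
                missing.append(name)
        if self.clamping_type not in (None, "continuous", "discontinuous"):
            missing.append("clamping_type (unrecognized value)")
        return missing


# Example: a partially completed record reports what still has to be filled in.
record = LiverSampleRecord(consent_obtained=True, age_years=52, gender="F")
print(record.missing_items())
```

A check of this kind would let a sample bank refuse to register a specimen until every listed item is present.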
The sampling of liver tissue of a qualified donor should be carried out by the surgeon. The following sources of liver tissue are listed from higher to lower priority on the basis of our experience:
• Liver tissue at a distance from a benign focal lesion: hemangioma or focal nodular hyperplasia excluding adenoma. Advantage: the resected lesion is usually surrounded by normal liver tissue. Disadvantage: the surgical resection of these lesions is rare.
• Liver tissue at a distance from a metastasis. Advantage: the surgical procedure is more common. Disadvantage: an immunological reaction is often associated in close contact with the metastasis. Therefore, liver tissue must be sampled at a distance of at least 3 cm.
• Part of a too large liver graft if not used for transplantation. Advantage: highest probability of normal histology. Disadvantage: highest duration of ischemia, which might induce changes in liver proteins.
Cholecystectomy is frequently performed during liver surgery, in particular right liver surgery, and is systematically performed during liver graft preparation. Provided that there is no bile retention in the gallbladder, there is usually no histological abnormality of the gallbladder epithelium in these conditions.
Sampling of Specimens during Surgery-Sampling is performed by the surgeon or the pathologist on the tissue specimen immediately after resection. Macroscopically normal liver tissue is sampled in cubes of approximately 5 mm³. The samples are placed in cryotubes, which are immediately immersed in liquid nitrogen. The cryotubes are stored at -80°C. Immediately before protein extraction, frozen tissue sections will be taken from the same sample for histological analysis to avoid tissue sampling error.
Transportation of Specimens-Frozen specimens (NLS, 15 g) were transported from the Bréchot laboratory to the He laboratory on dry ice in 2 days.
Pooling of Specimens-Considering variation, polymorphism, complication, and representation within populations, it is strongly recommended to use a sample pool of donors instead of individuals. In addition, pooled samples could provide larger volume and more consistency.
Pulverization, mixing, and subpackaging of NLS were performed under strict precautions to avoid contamination. Mortars and pestles were precooled with dry ice. The frozen tissue was transferred to a prechilled mortar and ground with the pestle on dry ice until a powder formed. The liver tissue samples from different donors were pooled, mixed well, and subdivided into 0.5-g aliquots in appropriate tubes. Finally, the tubes were filled with dry N2, frozen at -80°C, and stored for distribution.
The recovery of NLS after preprocessing was 98%, which meets the needs of sample delivery. Pooled specimens were shipped on dry ice to the eight participating laboratories (1.5 g of specimen per laboratory).
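The quantities quoted above can be checked with simple bookkeeping. The sketch below assumes, for illustration only, that the whole 15 g shipment was pooled and that the 98% recovery applies to that total; the function and variable names are hypothetical. Masses are kept in milligrams so the arithmetic stays in exact integers.

```python
# Values taken from the text, expressed in milligrams.
TRANSPORTED_MG = 15_000   # 15 g of frozen specimens transported
RECOVERY_PERCENT = 98     # recovery after preprocessing
ALIQUOT_MG = 500          # 0.5-g aliquots
PER_LAB_MG = 1_500        # 1.5 g shipped to each laboratory
N_LABS = 8                # participating laboratories


def aliquot_plan(total_mg, recovery_percent, aliquot_mg, per_lab_mg, n_labs):
    """Return recovered mass and the number of whole aliquots available, needed, and spare."""
    recovered_mg = total_mg * recovery_percent // 100
    available = recovered_mg // aliquot_mg          # only whole aliquots count
    per_lab = -(-per_lab_mg // aliquot_mg)          # ceiling division
    needed = per_lab * n_labs
    return {
        "recovered_mg": recovered_mg,
        "aliquots_available": available,
        "aliquots_per_lab": per_lab,
        "aliquots_needed": needed,
        "aliquots_spare": available - needed,
    }


plan = aliquot_plan(TRANSPORTED_MG, RECOVERY_PERCENT, ALIQUOT_MG, PER_LAB_MG, N_LABS)
print(plan)
```

Under these assumptions, 14.7 g is recovered, giving 29 whole aliquots; eight laboratories at three aliquots each need 24, leaving 5 in reserve, which is consistent with the statement that the recovery meets the needs of sample delivery.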

Expression Profile of Normal Human Liver
The four Chinese laboratories that received pooled specimens chose 12 strategies for expression profile analysis. In two-dimensional gel electrophoresis-based analyses, gel strips of pH 4-7 and pH 3-10 were chosen. For protein extraction, sequential extraction was also performed in some laboratories for comparison against total extraction. Based on past experiences, most participant laboratories agreed that the next step of the HLPP expression profiling study should be aimed at fractionated protein analyses based on protein biochemical characteristics and subcellular fractionation. Based on the first set of data collected from the four reference laboratories, a primary analysis was performed. Overall, 4975 unique proteins and 2338 groups with 9245 proteins involved were identified by the four groups using the gel-based and shotgun platforms mentioned above. Moreover, the functional categories of the protein expression profile were analyzed.
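The text does not say how identifications from the four laboratories were merged or how the protein groups were formed. One common convention, assumed here purely for illustration, is to take the union of each protein's identified peptides across laboratories and to place proteins that share at least one peptide in the same group. The sketch below implements that convention with a union-find structure; the accession names and peptide strings are invented.

```python
from collections import defaultdict


def group_proteins(protein_to_peptides):
    """Group proteins sharing at least one peptide (connected components via union-find)."""
    parent = {p: p for p in protein_to_peptides}

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_owner = {}  # peptide -> first protein seen with that peptide
    for protein, peptides in protein_to_peptides.items():
        for pep in peptides:
            if pep in first_owner:
                union(first_owner[pep], protein)
            else:
                first_owner[pep] = protein

    groups = defaultdict(set)
    for protein in protein_to_peptides:
        groups[find(protein)].add(protein)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical identifications from two laboratories.
lab_a = {"P1": {"AAK", "CCR"}, "P2": {"CCR", "DDK"}}
lab_b = {"P2": {"DDK"}, "P3": {"EEK"}}

merged = defaultdict(set)
for lab in (lab_a, lab_b):
    for protein, peptides in lab.items():
        merged[protein] |= peptides

groups = group_proteins(merged)
print(len(merged), "unique proteins in", len(groups), "groups:", groups)
```

Other grouping rules (for example, parsimony-based inference) would give different counts, so the numbers reported above should not be read as the output of this particular rule.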

Antibody Production
A strategy has been established to set up a murine hybridoma cell bank against human liver and plasma proteins using unknown and native proteins as the immunogens. To date, about 1000 murine hybridoma cell lines have been established. Of them, more than 100 monoclonal antibodies reacted specifically with the 10 most highly abundant proteins in human liver and plasma.

Data Management
Data from large scale projects such as proteome experiments are intrinsically complex, and high level conclusions can result only from data analyses, not directly from the experimental results themselves. Thus systematic, mature, and effective data management and analysis technologies are critical components of any large scale proteome research; they will present hard challenges and require much development, training, and management investigation as well as cooperation among teams with different backgrounds.
First of all, the Data Collection Centers of HLPP work closely with the Proteomics Standards Initiative of the European Bioinformatics Institute to facilitate data collection and exchange among all HUPO initiatives. More discussion concerning data management will be held between HLPP and the Human Plasma Proteome Project for integration of data mining between the two initiatives.
The tasks of HLPP data management are generalized as 1) to provide standards for data dissemination and exchange with universality, applicability, and long term validity, 2) to collect all of the HLPP experiment data with explicit or implicit value generated by HLPP participating laboratories, and 3) to provide systematic support for large scale data analysis. Importantly three criteria are our standards for data management: completeness, accuracy, and permanence, i.e. CAP.
Data management of HLPP is performed in two data centers: the Beijing (National) Proteome Research Center, Beijing Institute of Radiation Medicine, and the Bioinformatics Center, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences. The workflow of a data center is currently defined as follows:
• Data collection.
• Database importing and releasing and other data management activity.
• Data management logging.
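The workflow steps listed above can be sketched as a small pipeline in which every action is logged. This is an illustrative outline under stated assumptions, not the data centers' actual software: the class, the required fields, and the log format are hypothetical. The completeness check and the rule that imported entries are never overwritten are one possible reading of the completeness and permanence criteria.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List

# Hypothetical minimum content of a submission from a participating laboratory.
REQUIRED_FIELDS = ("laboratory", "sample_id", "platform", "protein_ids")


@dataclass
class DataCenter:
    database: Dict[str, dict] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def _record(self, message: str) -> None:
        """Data management logging: append a timestamped entry."""
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        self.log.append(f"{stamp} {message}")

    def collect(self, submission: dict) -> bool:
        """Data collection: accept a submission only if it is complete."""
        missing = [k for k in REQUIRED_FIELDS if not submission.get(k)]
        if missing:
            self._record(f"REJECTED {submission.get('sample_id', '?')}: missing {missing}")
            return False
        self._record(f"COLLECTED {submission['sample_id']}")
        return True

    def import_and_release(self, submission: dict) -> str:
        """Database importing: store under a stable key and never overwrite."""
        key = f"{submission['laboratory']}/{submission['sample_id']}"
        if key in self.database:
            self._record(f"SKIPPED {key}: already imported")
            return key
        self.database[key] = dict(submission)
        self._record(f"IMPORTED {key}")
        return key


# Example run with an invented submission.
center = DataCenter()
entry = {"laboratory": "lab1", "sample_id": "S1", "platform": "2DE", "protein_ids": ["P1"]}
if center.collect(entry):
    center.import_and_release(entry)
print(center.log)
```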

PERSPECTIVES
Establishing the "Solar System" of Human Liver Proteome by the End of the Action Phase-The solar system of human liver proteome (Fig. 2) will be composed of expression profile, modification profile, proteome localization map, protein-protein interaction map, panel of antibodies, specimen bank, database warehouse, liver physiome and pathome, comparative proteomes between healthy and diseased livers and among races and species (human/mouse), integration of HLPP with HUPO Plasma Proteome Project, and integration of the liver proteome with the liver transcriptome and the human genome.
Making the Modern Prometheus Myth-The liver, the main "chemical factory" and "energy plant" for the body, is just like Prometheus, who stole fire for mortals. Liver diseases (for example, those caused by hepatitis viruses) are like the eagle, attacking the liver and making people suffer pain and distress. Fortunately, just as the hero Heracles finally killed the eagle to rescue Prometheus, HLPP, the modern Heracles (Fig. 3), will eliminate liver diseases and benefit human beings in the near future!