
Protein characterization from natural matrices by MALDI TOF-TOF


Academic year: 2021



Università della Calabria, Department of Chemistry. Doctoral thesis (Dottorato di Ricerca) in Methodologies for the Development of Molecules of Pharmacological Interest, XX Cycle (CHIM/01). PROTEIN CHARACTERIZATION FROM NATURAL MATRICES BY MALDI TOF-TOF. Supervisors: Prof. Anna NAPOLI and Prof. Giovanni SINDONA. Candidate: Dr. Donatella AIELLO. Coordinator: Prof. Bartolo GABRIELE. Academic years 2005-2007.


For my thoughts are not your thoughts, neither are your ways my ways, said the Lord. (Isaiah 55,8).

To every thing there is a season, and a time to every purpose under the heaven: A time to be born, and a time to die; a time to plant, and a time to pluck up that which is planted; A time to kill, and a time to heal; a time to break down, and a time to build up; A time to weep, and a time to laugh; a time to mourn, and a time to dance; A time to cast away stones, and a time to gather stones together; a time to embrace, and a time to refrain from embracing; A time to get, and a time to lose; a time to keep, and a time to cast away; A time to rend, and a time to sew; a time to keep silence, and a time to speak; A time to love, and a time to hate; a time of war, and a time of peace (Qohelet 3, 1-8).


To my mother, I know that you are looking at me. To my family. To my love.


PROTEIN CHARACTERIZATION FROM NATURAL MATRICES BY MALDI TOF-TOF. PhD Student Donatella AIELLO.


Contents
Trends in Proteomics
Sample preparation and purification in Proteomics
MS and Proteomics
PROTEIN-EXPRESSION PROFILING: Olea europaea Olive Pollens
PROTEIN-EXPRESSION PROFILING and PMF: Structure and Function of proteases in Mastitic Milk
MAPPING OF PROTEIN MODIFICATIONS: Ole e 1 micro-heterogeneity
Conclusions
Appendix
Bibliography


Trends in Proteomics

Proteomics is “the qualitative and quantitative” comparison of proteomes under different conditions, carried out to understand the cellular mechanisms underlying biological processes. Its main objective is to obtain an overview of the proteins expressed at a given point in time in a given tissue and to relate them to the biochemical status of that tissue [1, 2]. Protein-expression studies make it possible to characterize bioactive markers, to study protein-protein interactions or interactions between proteins and other molecular species, and to map the presence of bioactive compounds in functional food, in order to characterize proteins with particular properties, for example allergens, structural proteins or carriers in metabolic pathways [3, 4]. The complexity of any proteome, the time- and cell-specific protein complement of the genome, makes all proteome analysis technically challenging. Determination of proteins in either small or large cells requires methods for separating protein mixtures into their individual components.

Three developments changed the biological landscape and formed the foundation of the new biology [5, 6]. The first was the growth of gene, expressed sequence tag (EST), and protein-sequence databases. These resources became ever more useful as partial catalogues of expressed genes in many organisms and culminated in the complete sequence of the human genome. The genome sequences of several plants and of other widely studied animals have also been completed recently or are approaching completion. These genome-sequence databases are the catalogues from which much of our understanding of living systems will eventually be extracted. The second key development was the introduction of user-friendly, browser-based bioinformatics tools to extract information from these databases. Such database search tools

are integrated with other tools and databases to predict the functions, locations, and properties of protein products from the occurrence of specific functional domains or motifs. The third key development was the improvement of the oligonucleotide microarray. The array contains a series of gene-specific oligonucleotides or cDNA sequences on a slide or a chip. By applying a mixture of fluorescently labeled DNAs from a sample of interest to the array, one can probe the expression of thousands of genes at once. One array can replace thousands of Northern-blot analyses and can be run in the time it would take to do a single Northern blot.

[Figure 1. Interrelationships between molecular classes in cells. Labels: Genome, Active Genome, Transcriptome, Proteome, Metabolome, Environment.]

Nowadays it is possible to “see” the whole system, but the information contained in these thousands of data points is beyond our ability to interpret intuitively. New clustering algorithms, self-organizing maps, and similar tools represent the latest approaches to rendering the data in ways that biologists and chemists can comprehend. All these improvements make it possible to see the complete system, to think big, and to imagine a cell with thousands or tens of thousands of genes that may be expressed in varying

combinations, giving rise to thousands or tens of thousands of different proteins. Each protein, whether a trans-membrane receptor, a transcription factor, a protein kinase, or a chaperone, expresses a function that assumes significance only in the context of all the other functions and activities being expressed in the same cell.

Currently, proteomic approaches based on the analysis of protein patterns are commonly used and may provide more effective protein profiling for diagnostic purposes. They include two-dimensional polyacrylamide gel electrophoresis (2-DE) [7], surface-enhanced laser desorption ionization (SELDI) [8], matrix-assisted laser desorption ionization (MALDI) [9], liquid chromatography (LC) [10], and capillary electrophoresis (CE) [11], followed by computational image analysis and protein identification by mass spectrometry [12]. The use of combined proteomic techniques for protein identification is a powerful approach that can give a better understanding of the mechanisms of diseases in which proteins play a major role [13, 14]. The proteomic approach, using different analytical techniques, has been applied successfully to protein-expression analysis and to the screening, identification and characterization of proteins, but some techniques have limitations, and the factors behind them must be considered in order to overcome them.

The composition of the proteome and the analytical methods are the main limiting factors in proteomic analysis. Each sample contains a huge diversity of proteins with different chemical properties and characteristics. Sample preparation is the most important factor in the first step of a proteomic analysis: an ineffective step can waste valuable sample, time and money. Sample preparation is itself affected by several factors, such as sample extraction, protein solubilization, protease inhibitors, protein concentration, and non-protein contamination.

The limitations of the analytical methods, on the other hand, concern the detection and quantification of proteins, usually because low-abundance proteins are difficult to detect in biological materials. In addition, some proteomic techniques, 2-DE for example, suffer from limited reproducibility, sensitivity and accuracy; these problems are partially overcome by the use of mass spectrometry.

To understand proteomics better, it is important to explain how it differs from protein biochemistry. Both involve protein identification, but protein biochemistry involves complete sequence analysis, structure determination, and modeling studies to explore how protein structure governs function, whereas proteomics is the study of multi-protein systems (Table 1).

Table 1. Differences between protein chemistry and proteomics.
Protein chemistry:
• Individual proteins
• Complete sequence analysis
• Emphasis on structure and function
• Structural biology
Proteomics:
• Complex mixtures
• Partial sequence analysis
• Emphasis on identification by database matching
• Systems biology

In proteomics the focus is on the interplay of multiple, distinct proteins in their roles as part of a larger system or network; the analyses are directed at complex mixtures, and identification relies on partial sequence analysis with the aid of database-matching tools. In other words, the point of proteomics is to characterize the behaviour of the system rather than the behaviour of any single component [5, 15, 16].

Proteomics encompasses four principal applications [6]: 1) mining, 2) protein-expression profiling, 3) protein-network mapping, and 4) mapping of protein modifications (Figure 2) [17].

1) Mining is simply the exercise of identifying all (or as many as possible) of the proteins in a sample. The point of mining is to record the proteome directly, rather than to infer its composition from expression data for genes (e.g., by microarrays). Mining is the ultimate brute-force exercise in proteomics: one simply resolves proteins to the greatest extent possible and then uses MS and associated database and software tools to identify what is found.

2) Protein-expression profiling is the identification of proteins in a particular sample as a function of a particular state of the organism or cell (e.g., differentiation, developmental

state, or disease state) or as a function of exposure to a drug, chemical, or physical stimulus. Expression profiling is actually a specialized form of mining. It is most commonly practiced as a differential analysis, in which two states of a particular system are compared. For example, normal and diseased cells or tissues can be compared to determine which proteins are expressed differently in one state compared to the other.

[Figure 2. Classification of proteomics approaches. Labels in the diagram: Proteome Mining; Protein Expression Profiling; Differential Display; Post-translational Modification (Glycosylation, Phosphorylation, Proteolysis); Protein-network Mapping; Protein-protein Interaction; Yeast Two-hybrid; Co-precipitation; Phage Display; Affinity Purified Protein Complexes; Protein Complexes; Functional Proteomics; Structural Proteomics; Organelle Composition; Subproteome Isolation; Yeast Genomics; Mouse Knockouts; Signal Transduction; Medical Microbiology; Disease Mechanisms; Drug Discovery; Target Identification/Validation.]

3) Protein-network mapping is the proteomics approach to determining how proteins interact with each other in living systems. Most proteins carry out their functions in close association with other proteins. These interactions determine the behaviour of functional protein networks, such as signal-transduction cascades and complex biosynthetic or degradation pathways. Proteomics approaches, however, offer the opportunity to characterize more complex protein networks. As such, protein-network profiling represents one of the most ambitious and potentially powerful future applications of proteomics.

4) Mapping of protein modifications is the task of identifying how and where proteins are modified. Many common post-translational modifications govern the targeting, structure, function, and turnover of proteins. In addition, many environmental chemicals, drugs, and endogenous chemicals give rise to reactive electrophiles that modify proteins. Proteomics approaches offer the best means of establishing both the nature and the sequence specificity of post-translational modifications.

Analytical protein identification is built around one essential fact: most peptide sequences of approximately six or more amino acids are largely unique in the proteome of an organism. Put another way, a typical six-amino-acid peptide maps to a single gene product. Thus, if we can obtain the sequence of the peptide, or if we can accurately measure its mass, we can identify the protein it came from simply by finding its match in a database of protein sequences (Figure 3).

[Figure 3. Essential elements of the analytical proteomics approach. Labels in the diagram: Protein mixture; Separation; Proteins; Peptide mixture; Separation; Peptides; MS analysis; MS data; Identification.]
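The statement that a peptide of about six residues is largely unique can be made plausible with a rough count. The sketch below is only a back-of-envelope illustration, not part of the thesis: it assumes the 20 standard amino acids occur uniformly and independently (real proteomes do not), and the proteome size used in the example is a hypothetical figure chosen for illustration.

```python
def sequence_space(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct peptide sequences of the given length."""
    return alphabet_size ** length


def expected_random_hits(length: int, proteome_residues: int) -> float:
    """Expected number of positions in a random proteome that match one
    given peptide, assuming uniform, independent residues (an assumption)."""
    windows = max(proteome_residues - length + 1, 0)
    return windows / sequence_space(length)


if __name__ == "__main__":
    # 20**6 possible hexapeptides
    print(sequence_space(6))
    # Hypothetical proteome of ten million residues (illustrative value only)
    print(expected_random_hits(6, 10_000_000))
```

Under these simplifying assumptions a given hexapeptide is expected to occur by chance well under once in a proteome of that size, which is consistent with the "largely unique" claim; sequence families and biased composition make real proteomes less favourable.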

Most analytical proteomics problems begin with a protein mixture. This mixture contains intact proteins of varying molecular weights, modifications, and solubilities. Before peptide sequences can be obtained, the proteins must be cleaved to peptides. The essence of analytical proteomics is to convert proteins to peptides, obtain sequences of the peptides, and then identify the corresponding proteins from matching sequences in a database.

In a typical proteomics experiment the first step is sample preparation; the second is often the separation of the protein mixture, for example by molecular weight with 1D-gel electrophoresis (SDS-PAGE) or by both isoelectric point and molecular weight with 2D-gel electrophoresis (2D-PAGE). Separated proteins are then visualised by staining with silver, Coomassie Blue or fluorescent dyes. In general, proteins are not analysed directly from the polyacrylamide gel, although some attempts have been made using direct MALDI-TOF MS analysis from ultra-thin IEF-IPG gel strips [18, 19]. However, the accurate mass of a protein is usually not sufficient to identify it with confidence in sequence databases [20]. Stained bands or spots containing the proteins of interest are usually excised from the polyacrylamide gel and digested with specific proteases (trypsin, LysC or other proteases or chemicals with specific cleavage sites). The resulting peptide mixture, extracted from the polyacrylamide matrix and analysed by mass spectrometry, generates an experimental peptide mass profile specific to the protein. This experimental profile is then compared with the theoretical masses derived from the in silico digestion, at the same enzyme cleavage site(s), of all protein sequences in the database. The proteins in the database are then ranked according to the number of peptide masses matching their sequence within a given mass error tolerance.

This process is called peptide mass fingerprinting (PMF) [21, 22]. A protein is generally considered identified with sufficient confidence when at least five peptide masses are matched with a mass accuracy better than 10 ppm, at least 15% of the sequence is covered, and the next best database hit shows significantly less agreement with the experimental data [23]. Another widespread approach is to digest the protein mixture and then separate the resulting peptides by liquid chromatography, followed by mass spectrometric analysis.
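The acceptance criteria quoted above for peptide mass fingerprinting (at least five matched peptides, 10 ppm mass accuracy, 15% sequence coverage) can be expressed as a short check. This is a minimal sketch under stated assumptions: the function names and the representation of matched peptides as (start, end) residue spans are hypothetical choices made for illustration, and the third criterion (comparison with the next best database hit) is not implemented.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_peaks(observed, theoretical, tol_ppm=10.0):
    """Pair each observed mass with the first theoretical mass within tol_ppm."""
    matches = []
    for obs in observed:
        for theo in theoretical:
            if abs(ppm_error(obs, theo)) <= tol_ppm:
                matches.append((obs, theo))
                break
    return matches


def sequence_coverage(protein_length: int, matched_spans) -> float:
    """Percentage of residues covered by matched peptides.
    Spans are (start, end) positions, 0-based, end exclusive."""
    covered = set()
    for start, end in matched_spans:
        covered.update(range(start, end))
    return 100.0 * len(covered) / protein_length


def meets_criteria(n_matched: int, coverage_percent: float,
                   min_peptides: int = 5, min_coverage: float = 15.0) -> bool:
    """Apply the peptide-count and coverage thresholds quoted in the text."""
    return n_matched >= min_peptides and coverage_percent >= min_coverage
```

For example, a 0.005 Da deviation on a peptide of mass 1000 Da corresponds to 5 ppm and would be accepted at the 10 ppm tolerance, whereas a 0.1 Da deviation at 1500 Da (about 67 ppm) would not.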

Sample preparation and purification in Proteomics

Sample preparation is essential to a successful experiment and is not always simple, because biological materials contain both the protein of interest and interfering substances, such as salts, small ionic molecules, ionic detergents, charged molecules, lipids, and other non-protein components; the protein of interest must be isolated by a suitable preparation method. Interfering substances make protein separation difficult and also disturb detection and identification in proteome studies, so sample preparation is needed to deplete or entirely remove them from the biological sample prior to analysis. Methods for separating proteins take advantage of properties that vary from one protein to the next, including size, charge, and binding properties. The source of a protein is generally tissue or microbial cells. The first stage in any protein purification procedure is to break open these cells, releasing their proteins into a solution called a crude extract. Once the extract or organelle preparation is ready, there are various ways, depending on the type of sample, to prepare the protein sample for further analysis. The general sample preparation methods of greatest interest in proteomics are pre-fractionation and enrichment of proteins prior to further separation by preparative electrophoresis or chromatography [24]. The basic methods, including precipitation, dialysis, ultrafiltration and gel filtration, can be employed to concentrate the sample and to separate the proteins from potentially interfering substances.

Commonly, the extract is subjected to treatments referred to as fractionation. Early fractionation steps in a purification exploit differences in protein solubility, which is a complex function of pH, temperature, salt concentration, and other factors. The solubility of proteins is generally lowered at high salt concentrations, an effect called “salting out.” The addition of a salt in the right amount can selectively precipitate some proteins, while others remain in solution. Several precipitation methods exist, relying on different chemical principles; they can be performed with ammonium sulfate, trichloroacetic acid (TCA), ethanol, chloroform or acetone [25]. Although many protein precipitation methods have the advantage of concentrating the sample and eliminating interferences, they also have the disadvantages of irreversible protein denaturation and protein

insolubilization. Another long-established procedure for reducing the salt concentration in samples is dialysis; the separation is based on diffusion, which allows low-molecular-weight contaminants to be removed from sample solutions. Dialysis can remove most interfering substances, but protein may be lost, a large volume of exchange buffer is needed, the sample has to be concentrated afterwards, and the procedure takes longer than other desalting techniques [26]. Ultrafiltration, by contrast, removes high-molecular-weight polysaccharides and salts in a short time and avoids precipitation [27, 28]. Although removing low-molecular-weight proteins or interferences concentrates the protein, some high-molecular-weight interferences are concentrated as well, which is the disadvantage of this technique. Each sample preparation method has advantages and disadvantages, which depend on the sample composition.

It is sometimes essential to remove the high-abundance proteins and enrich the low-abundance ones in order to increase the number of proteins identified [29, 30, 31], because abundant proteins can obscure low-abundance proteins that may act as disease biomarkers [32]. Currently, the detection of specific low-abundance proteins is being studied with commercial removal kits coupled with immunoprecipitation using different types of antibodies, in order to extend the dynamic concentration range available for protein identification and characterization. This combination of techniques can detect trace proteins, although many proteins of still lower abundance remain undetectable.

In addition, immobilized metal affinity chromatography (IMAC) is a separation technique that uses chelating compounds covalently bound to solid chromatographic supports to entrap metal ions. These ions serve as affinity ligands for various proteins or peptides through coordinative binding of amino acid residues exposed on the surface; exposed histidine residues are primarily responsible for binding to the immobilized metal ions [33]. The technique has proven to be one of the most effective ways of reducing sample complexity and enriching target proteins, and it is also used to isolate and selectively enrich phosphoproteins or phosphopeptides from complex protein mixtures [34]. On the other hand, an alternative

enrichment approach for low-abundance proteins is gel filtration chromatography, which separates proteins by size exclusion.

In a typical proteomics experiment, the second important step after sample preparation is the separation of the protein mixture. Electrophoresis is especially useful as an analytical method; it is an important technique for the separation of proteins, based on the migration of charged proteins in an electric field. Its advantage is that proteins can be visualized as well as separated, permitting a researcher to estimate quickly the number of different proteins in a mixture or the degree of purity of a particular protein preparation. Electrophoresis allows the determination of crucial properties of a protein such as its isoelectric point and approximate molecular weight. It is generally carried out in gels made of the cross-linked polymer polyacrylamide, which acts as a molecular sieve, slowing the migration of proteins approximately in proportion to their charge-to-mass ratio.

Isoelectric focusing is a specific electrophoretic procedure used to determine the isoelectric point (pI) of a protein. A pH gradient is established by allowing a mixture of low-molecular-weight organic acids and bases to distribute themselves in an electric field generated across the gel. When a protein mixture is applied, each protein migrates until it reaches the pH that matches its pI. Proteins with different isoelectric points are thus distributed differently throughout the gel. Combining isoelectric focusing and SDS electrophoresis sequentially, in a process called two-dimensional electrophoresis (2-DE), permits the resolution of complex mixtures of proteins. This is a more sensitive analytical method than either electrophoretic method alone. Two-dimensional electrophoresis separates proteins of identical molecular weight that differ in pI, or proteins with similar pI values but different molecular weights.

This separation method has become synonymous with proteomics and remains the single best method for resolving highly complex protein mixtures. As in SDS-PAGE, proteins separated by 2-DE are visualized by conventional staining techniques, including silver, Coomassie, and amido black stains. Despite the superiority of 2D-SDS-PAGE over other methods as a means of resolving complex protein mixtures, the technique presents some problems. The first is the difficulty of performing completely reproducible 2D-SDS-PAGE analyses. This problem becomes important when one wishes to use 2-DE to compare two

samples by comparing the images of the stained gels. Differences in protein migration in either dimension could be mistaken for differences in the levels of certain proteins between the two samples. A second problem is the relative incompatibility of some proteins with the first-dimension IEF step. For example, many large, hydrophobic proteins simply do not behave well in this type of analysis. A third problem is the relatively small dynamic range of protein staining as a detection technique. Spot densities reflect about a 100-fold range of protein concentrations, at best. This means that staining of 2D gels allows the visualization of abundant proteins, whereas less abundant proteins frequently cannot be detected.

An important advantage, however, is the resolution of proteins into multiple, discrete bands due to the presence of multiple protein forms with different isoelectric points. Protein modifications that may affect pI include glycosylation, phosphorylation, oxidation, and exogenous chemical modifications. In some cases, differently modified variants of the same polypeptide may appear as spot “trains”. Although this degree of resolution can be useful in establishing which protein forms are present, it can also complicate the estimation of relative protein expression in two samples by 2D-SDS-PAGE.

In any case, the use of an initial protein separation followed by digestion and analysis is the most widely practiced analytical proteomics approach today. This is based largely on the pre-eminence of 2-DE for protein separations. The biggest single advantage of this approach is the ability of 2D gels to serve as image maps that allow investigators to compare changes in the proteome based on changes in the patterns of spots on the gel. As noted earlier, several factors can confound the interpretation of 2-DE gel-spot patterns, but no other available technique provides an intuitive “photograph” of the proteome. For lower-abundance proteins, however, 2-DE gels will not prove useful, simply because important proteins cannot be seen. In this case, other separation methods, particularly tandem LC, provide a viable alternative.

High-performance liquid chromatography (HPLC) is an important analytical method employed for protein purification. The diversity of stationary phases and separation modes gives HPLC considerable resolving power. Although HPLC of intact proteins has not become a

widely used technique for analytical proteomics, it is nevertheless highly applicable as an initial step to fractionate protein mixtures. Different chromatographic separations are available, including reversed-phase (RP), anion and cation exchange, size exclusion, and affinity chromatography. More frequently, HPLC separation is applied to proteins after digestion. The main rationale for this approach is that it converts a very heterogeneous mixture of proteins into a more homogeneous mixture of peptides, which can be analyzed more easily. The use of combined separation modes in series is referred to as “tandem HPLC”. For example, strong cation exchange followed by RP applies two completely different separation modes. The tandem LC approach makes possible the identification of peptides from proteins that are present in a mixture at low abundance. This is in contrast to 2-DE, which tends to identify more highly expressed proteins. The superiority of tandem LC over 2-DE is probably due to two factors, one obvious and the other not so obvious. First, proteins are selected from 2D gels for digestion and MS only if they can be visualized by staining. However, the limits of detection of many MS instruments are below the levels at which proteins can be detected by gel staining. Thus, if one cannot see a protein spot to harvest and analyze, no data will be collected on that protein. Second, handling of proteins in mixtures may provide a “carrier effect,” in which the presence of more abundant peptides prevents the loss of less abundant peptides. When one works with very dilute samples containing little material (such as would be obtained from a 2D gel spot), the fractional loss due to interaction with surfaces and other processing components is relatively high.

MS and Proteomics

Mass spectrometry has become a valuable technique in protein analysis as a result of the development of two new ionization methods, MALDI [35, 36] and electrospray ionization (ESI) [37], which allow the routine analysis of biopolymers. These methods solved the difficult problem of generating ions from large, non-volatile analytes such as proteins and peptides without significant analyte fragmentation. Because analyte fragmentation is absent or minimal during the ESI and MALDI processes, they are also referred to as “soft” ionization methods. In fact they are so soft that under specific conditions even non-covalent

interactions may be maintained during the ionization process. ESI gained immediate popularity because of the ease with which it could be interfaced with popular chromatographic and electrophoretic liquid-phase separation techniques [38]. Furthermore, because ESI tends to produce multiply charged analytes, simple quadrupole instruments and other types of mass analyzers with a limited m/z range could be used to detect analytes with masses exceeding the nominal m/z range of the instrument. For different but no less compelling reasons, MALDI also rapidly gained popularity. The time-of-flight (TOF) mass analyzer most commonly used with MALDI is robust, simple, and sensitive and has a large mass range. MALDI mass spectra are simple to interpret because the method generates predominantly singly charged ions. The method is relatively resistant to interference from matrices commonly used in protein chemistry. In particular, MALDI has a number of advantages over electrospray, in that the majority of generated ions are detected and the process is more tolerant of salts and detergents. Likewise, the instrumentation and the spectra are simpler. As a result, the measurement of individual samples is more easily automated and adapted for higher throughput. Furthermore, as a solid sample is used, the acquisition can be paused at any time [39]. Electrospray, on the other hand, is the interface of choice for coupling liquid chromatography with mass spectrometry to allow the analysis of complex mixtures. As a consequence of its strengths, MALDI has been employed as a fast pre-screening tool in proteomic studies to identify gel-separated proteins. Because of its sensitivity, this leaves the vast majority of the sample available for the optional, more time-consuming but more powerful electrospray techniques [40].
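The remark above that multiply charged ESI ions bring large analytes within reach of analyzers with a limited m/z range can be illustrated numerically. The sketch below assumes positive-ion mode with charging by protonation only; the 20 kDa protein mass and the m/z limit of 2000 in the example are hypothetical values chosen for illustration, not figures from the thesis.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of an ion [M + zH]^z+ formed by adding `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charges_within_range(neutral_mass: float, mz_limit: float,
                         max_charge: int = 100):
    """Charge states whose m/z falls at or below the analyzer's upper limit."""
    return [z for z in range(1, max_charge + 1)
            if mz(neutral_mass, z) <= mz_limit]


if __name__ == "__main__":
    # Hypothetical 20 kDa protein on an analyzer limited to m/z 2000
    observable = charges_within_range(20000.0, 2000.0)
    print("lowest observable charge state:", observable[0])
    print("m/z at that charge state:", round(mz(20000.0, observable[0]), 2))
```

A singly charged ion of this protein, as MALDI would predominantly produce, lies far above such a limit, which is one reason MALDI is usually paired with a TOF analyzer of large mass range.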
Numerous reports document the success MS has enjoyed in studies of the four structural levels of proteins: the primary structure, or linear sequence of amino acids; the secondary structure, or the folding of stretches of amino acids into defined structural motifs; the tertiary structure, or the overall three-dimensional fold; and the quaternary structure, or the spatial arrangement of folded polypeptides in multiprotein complexes. However, the application of MS to proteomics has to date been realized mostly

for the study of protein primary structures, even though MS plays an increasing role in the systematic study of protein higher-order structures, i.e., structural proteomics, as well as of protein-ligand interactions. Because of their relative softness of ionization, ESI and MALDI have been used in attempts to generate gas-phase ions of non-covalently associated, apparently intact protein complexes for the purpose of studying these structures by MS.

Traditionally, proteins have been identified by de novo sequencing, most frequently by the automated, stepwise chemical degradation (Edman degradation) of proteins or of isolated peptide fragments thereof [41]. These partial sequences were occasionally used to assemble the complete protein sequence from overlapping fragments, but more frequently to generate probes for isolating the gene coding for the protein from a gene library. With the growing size of sequence databases, it became apparent that even relatively short and otherwise imperfect sequences (with gaps or ambiguous residues) were useful for the identification of proteins. This was done by correlating information obtained experimentally from the analysis of peptides with sequence databases. The concept of identifying proteins by correlating information extracted from a protein or peptide with sequence databases, rather than by de novo sequencing, was significantly enhanced when it was realized that mass spectrometers were ideally suited to generate the required data.

Mass spectrometric methods applied to proteomics are usually based on peptide mass mapping or on protein identification using single peptides. Peptide mass mapping is based on the insight that the accurate masses of a group of peptides derived from a protein by sequence-specific proteolysis (i.e., a mass map or fingerprint) are a highly effective means of protein identification, whereas protein identification using single peptides depends on tandem mass spectrometry to generate sequence-specific spectra for peptides. The principle behind protein identification by mass mapping is therefore conceptually quite simple [42]: proteins of different amino acid sequence will, after proteolysis with a specific protease, produce groups of peptides whose masses constitute a mass fingerprint unique to a specific protein (Figure 4).

[Figure 4. Protein identification by mass spectrometry. Panel labels: Gel; Mass spectrometry analysis; Database; Protein identification.]

Therefore, if a sequence database containing the specific protein sequence is searched using selected masses (i.e., the observed peptide mass fingerprint), the protein is expected to be correctly identified within the database. Various methods that automate this process have been developed and reviewed [43]. They vary in specific details but share the following sequence of steps:

(i) Peptides are generated by digestion of the sample protein using sequence-specific cleavage reagents that allow residues at the carboxyl- or amino-terminus to be considered fixed for the search. For example, the enzyme trypsin, which is popular for mass mapping, leaves arginine (R) or lysine (K) at the carboxyl-terminus, and the N-termini of tryptic peptides (except for the N-terminal one) are expected to be the amino acid following a K or R residue in the protein sequence.

(ii) Peptide masses are measured as accurately as possible in a mass spectrometer. An increase in mass accuracy will decrease the number of isobaric peptides for any given mass in a sequence database and therefore increase the stringency of the search.
(iii) The proteins in the database are “digested” in silico using the rules that apply to the proteolytic method used in the experiment, to generate a list of theoretical masses that are compared to the set of measured masses.
(iv) An algorithm is used to compare the set of measured peptide masses against the sets of masses predicted for each protein in the database and to assign a score to each match that ranks the quality of the matches.

Obviously, for a protein to be identified its sequence has to exist in the sequence database being used for comparison. Protein and DNA sequence databases are equally suited. If DNA sequence databases are used, the DNA sequences are translated into protein sequences prior to digestion. The approach is therefore best suited to genetically well-characterized organisms for which either the entire genome is known or extensive protein or cDNA sequence exists. Clearly, protein identification by peptide mass mapping depends on the correlation of several peptide masses derived from the same protein with corresponding data calculated from the database. For this reason the method is suited neither for searches of EST (Expressed Sequence Tag) databases nor for the identification of proteins in complex mixtures if unseparated mixtures are proteolyzed. ESTs present a problem because they represent only a portion of a gene’s coding sequence. Such segments may not be long enough to cover a sufficient number of the peptides observed in the mapping experiment to allow an unambiguous identification.
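Steps (i) to (iv) can be illustrated with a minimal sketch in Python. It is not the algorithm of any of the programs cited in this chapter. It digests each database sequence in silico with a simplified trypsin rule (cleavage after K or R, assumed here not to occur before proline), computes monoisotopic [M+H]+ masses from standard residue masses, and ranks entries by the number of observed masses matched within a ppm tolerance. The function names, the two toy sequences and the 100 ppm default are assumptions made for illustration only.

```python
# Monoisotopic residue masses in u (standard reference values, not from the text).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added to the sum of residue masses
PROTON = 1.00728   # charge carrier of a singly protonated ion


def tryptic_digest(sequence, missed_cleavages=0):
    """Step (i)/(iii): cleave after K or R, but not before P (assumed rule)."""
    sites = [i + 1 for i, aa in enumerate(sequence[:-1])
             if aa in "KR" and sequence[i + 1] != "P"]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        # Join up to `missed_cleavages` extra fragments to the right.
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides


def peptide_mh(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, theoretical, tol_ppm=100.0):
    """Number of observed masses within tol_ppm of any theoretical mass."""
    hits = 0
    for m_obs in observed:
        if any(abs(m_obs - m_th) / m_th * 1e6 <= tol_ppm
               for m_th in theoretical):
            hits += 1
    return hits


def rank_proteins(observed, database, tol_ppm=100.0):
    """Step (iv): naive score = number of matched masses per entry."""
    scores = {}
    for name, seq in database.items():
        theoretical = [peptide_mh(p) for p in tryptic_digest(seq)]
        scores[name] = count_matches(observed, theoretical, tol_ppm)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy example: hypothetical sequences; "observed" masses simulated from protA.
database = {"protA": "GASPKVTLR", "protB": "MEHFKWYNDR"}
observed = [peptide_mh(p) for p in tryptic_digest(database["protA"])]
ranking = rank_proteins(observed, database)
print(ranking)
```

In this toy case protA is ranked first with both simulated masses matched. A plain match count of this kind favours long sequences, since they produce more theoretical peptides.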
Digests of unseparated protein mixtures present a problem for mass mapping because it is not apparent which peptides in the complex peptide mixture originate from the same protein. To overcome this problem, PMF is often combined with tandem MS of peptides in an iterative approach: as much information as possible is extracted by mass mapping, and tandem MS is then used to resolve any masses that remain ambiguous. In MALDI-TOF spectra of real samples, there are typically dozens of m/z signals. Peptide mass fingerprinting software can usually match just

about all of these to some entry in a database. However, given errors in m/z measurement, frequent sample contamination, and the presence of unanticipated posttranslational modifications, not all of the matches will point to the same protein. The simplest approach is to assign the highest score to the protein whose predicted tryptic peptides match the greatest number of m/z signals in the MS data. If we search only one m/z value, several proteins could be equally good matches. However, as we search a greater number of m/z values, more matches correspond to a particular protein and lead to a greater score for that protein than for others. This fairly simple approach works reasonably well with very good MS data. However, it tends to assign higher scores to larger proteins: larger proteins yield more tryptic peptides, so the chance of a match to one of them is greater than for smaller proteins.

Table 2. Sources for MS-Based Protein Identification Tools.
Sponsor (application)                                        Uniform resource locator (URL)
Eidgenossische Technische Hochschule (MassSearch)            http://cbrg.inf.ethz.ch
European Molecular Biology Laboratory (PeptideSearch)        http://www.mann.embl-heidelberg.de
Swiss Institute of Bioinformatics (ExPASy)                   http://www.expasy.ch/tools
Matrix Science (Mascot)                                      http://www.matrixscience.com
Rockefeller University (PepFrag, ProFound)                   http://prowl.rockefeller.edu
Human Genome Research Center (MOWSE)                         http://www.seqnet.dl.ac.uk
University of California (MS-Tag, MS-Fit, MS-Seq)            http://prospector.ucsf.edu
Institute for Systems Biology (COMET)                        http://www.systemsbiology.org
University of Washington (SEQUEST)                           http://thompson.mbt.washington.edu/sequest

To address these problems, several of the available peptide mass fingerprinting programs use more sophisticated scoring algorithms (Table 2). These algorithms correct for scoring bias due to protein size, in which larger proteins give rise to greater numbers of peptides.
They also correct for the tendency of smaller peptides in databases to have a greater number of matches with searched m/z values. Finally, some of these algorithms also apply probability-based statistics to better define the significance of protein identifications. The principal tools available for peptide mass fingerprinting can be grouped into three categories:

• First-generation freeware and subscription software tools that assign scores based on the number of m/z values in a spectrum that match database values within a given mass tolerance. These programs include PepSea (http://www.protana.com) and PeptIdent/MultIdent (http://www.expasy.ch/tools/peptident.html).
• Second-generation freeware and subscription software tools that employ scoring algorithms that take into account the effects of protein size and peptide length on the probabilities of matching. These include MOWSE (http://srs.hgmp.mrc.ac.uk/cgi-bin/mowse) and MS-Fit (http://prospector.ucsf.edu/).
• Third-generation software that employs more extensive probability-based scoring to provide a statistical basis for scores and also to estimate the probabilities that matches may reflect random events rather than true identities. These programs include ProFound (http://prowl.rockefeller.edu/cgi-bin/ProFound) and Mascot (http://www.matrixscience.com/).

If a pure protein is digested and the resulting peptide masses are compared with the list of peptide masses predicted for that protein, two observations are typically made. First, not all of the predicted peptides are detected. Second, some of the measured peptide masses are not present in the list of masses predicted from the protein. The first problem, the missing masses, is usually due to a number of problems that can occur both before and during mass spectrometric analysis, such as poor solubility, selective adsorption, ion suppression, selective ionization, very short peptide length, or other artefacts that cause sample loss or make specific peptides undetectable by MS. Since a relatively low number of peptide masses is sufficient for the positive identification of a protein, missing peptide masses are not generally considered a problem.
In contrast, unassigned peptide masses are a significant problem for protein identification by mass mapping and probably the single biggest source of misidentifications or missed identifications. Thus, to ensure that mass mapping results are reliable, it is important to understand the possible reasons for unassigned masses and to learn how to deal with them [44, 45]. Unassigned masses may be observed for one or more of the following reasons:

(i) Changes in the expected peptide masses caused by posttranslational modification (e.g., phosphorylation adds a net 80 u to an amino acid mass), by artifactual modifications arising from sample handling (such as oxidation of methionine), or by posttranslational processing (e.g., amino- or carboxyl-terminal processing). Some of these changes can be anticipated and incorporated into the search algorithm.
(ii) Low-fidelity proteolysis due to the presence of contaminating proteases that produce peptides unanticipated by the search algorithm (e.g., chymotryptic activity in a trypsin preparation), or missed cleavage sites. Again, this can be anticipated to some degree by the search algorithms.
(iii) The presence of more than one protein in the sample. It needs to be stressed that bands in SDS gels frequently, and spots in 2D gels occasionally, contain more than one protein, even if the respective features appear concise and sharp. In some cases, additional proteins can be detected by iterative database searching with the masses left unassigned to the primary target protein. Keratins and other common proteins represent another source of protein contamination.
(iv) The protein analyzed is a sequence homologue or splice variant of the one reported in the database. This must be confirmed using the sequence of a genetically well-characterized species.
(v) The protein is misidentified (i.e., a false positive).

In this context, the specificity of the enzymes employed for protein digestion should be discussed in more detail. Obviously, the higher the fidelity of the enzyme in hydrolyzing peptide bonds, the more reliably the search can be done with a fixed amino- or carboxyl-terminus.
The frequent observation that the protease products are not limited to the ones predicted from the expected enzymatic recognition sites is often due to contaminating protease activity, but it may also be due to a post-translational modification adjacent to the recognition site that blocks access by the enzyme, or to inefficient proteolysis. If this problem is anticipated, algorithms can be programmed to accommodate missed cleavages by allowing a given number to be entered as a parameter. Furthermore, the success of proteases in cleaving proteins is

dependent on accessibility to open stretches of primary amino acid sequence, and the native three-dimensional structure of the substrate protein will block access to many sites.

Data for use with peptide mass mapping are commonly obtained via MALDI-TOF analysis. However, any mass spectrometer capable of mass accuracies of around 100 ppm or better at 1000 u, in particular ESI-TOF and FT-ICR instruments, can be used to generate a mass map. For MALDI, analytes are spotted onto a metal plate either one at a time or, in a higher-throughput format, as multiple samples on the same plate. The samples are usually tryptic digests of proteins separated by 2DE, although proteins purified by other separation methods are also compatible with the method. Before deposition of the analytes, the matrix is placed on the plate or mixed with the sample. The matrix absorbs energy from the laser, causing the analytes to be desorbed and ionized (Figure 5). The m/z ratio of the ions is then typically measured from their flight time in a field-free drift tube (as opposed to ion mobility MS, where a field pushes ions through a gas) that constitutes the heart of the time-of-flight (TOF) mass analyzer. An additional bonus for samples isolated from biological sources is that MALDI is compatible with biological buffers such as phosphate and Tris and with low concentrations of urea, nonionic detergents, and some alkali metal salts. Peptide m/z ratios are calculated from the kinetic energy equation $E = \frac{1}{2}mv^{2}$, which relates the kinetic energy of an ion to its mass and velocity. At constant energy, low molecular weight ions travel faster than high molecular weight ions.

An inherent problem with the MALDI process is the small spread of kinetic energy that occurs during ionization. The spread reduces the resolving power and prevents the observation of the natural isotope distribution, even of small peptides.
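The link between flight time and m/z implied by the kinetic energy equation can be made explicit. The following is a sketch for an idealized linear TOF analyzer. The symbols $z$ (charge number), $e$ (elementary charge), $U$ (accelerating voltage), $L$ (length of the drift tube) and $t$ (flight time) are introduced here for illustration and do not appear in the original text, and the time spent in the acceleration region is neglected.

```latex
\begin{align}
  E = zeU &= \tfrac{1}{2} m v^{2} \\
  v &= \sqrt{\frac{2zeU}{m}} \\
  t = \frac{L}{v} &= L \sqrt{\frac{m}{2zeU}} \\
  \frac{m}{z} &= \frac{2eU\,t^{2}}{L^{2}}
\end{align}
```

Under these assumptions the flight time scales with the square root of m/z, and any spread in E among ions of the same m/z becomes a spread in t, which is the loss of resolving power described above.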
Two approaches, an ion mirror (reflectron) and “time-lag focusing” (delayed extraction), have been implemented in commercial instruments to overcome this problem. A reflectron is a device located at the end of the flight tube opposite from the ion source that decelerates the ions and then re-accelerates them back out of the reflectron toward a second detector. This is achieved by applying a decelerating voltage that is slightly higher than the accelerating voltage at the source. It has been observed that ions of lower kinetic energy do not penetrate

as far into the reflectron as those of higher energy. Consequently, the higher-energy ions travel a longer path, which lets the slower ions of the same m/z catch up and thereby compensates for the initial energy spread.

[Figure 5. MALDI source. Labels: sample plate; Nd:YAG laser, 355 nm; ground grid; +20 kV.]

The second approach to correct the initial spread of kinetic energies during MALDI is the time-lag focusing technique initially developed by Wiley and McLaren in 1953 and more recently reintroduced as “delayed extraction” [46]. In this method, the MALDI ions are created in a field-free region and allowed to spread out before the extraction voltage is applied to accelerate them for their flight through the drift tube. This results in a significantly decreased energy spread of the ions and thus in higher resolution. Delayed extraction also limits peak broadening due to metastable decomposition from ions colliding in the source during continuous ion extraction. The effects of these improvements are significant. Delayed extraction can increase the mass resolution to ≈2000-4000 for peptides in a linear instrument and, when it is combined with a reflectron, the resolution can increase further to ≈3000-6000 [47].

Large-scale protein identification critically depends on tandem mass spectrometry for the generation of sequence-specific spectra for peptides, the approach called Protein

Identification Using Single Peptides. Different amino acid compositions and permutations of an amino acid sequence can result in isobaric peptides. The amino acid sequence of a peptide is therefore more constraining than its mass for protein identification by sequence database searching [48]. At the mass accuracy achieved with the MALDI-TOF mass spectrometers that are frequently used for peptide mass measurement (10-100 ppm), several peptide masses from the same protein are required for unambiguous identification, whereas the amino acid sequence of even a relatively small peptide can uniquely identify a protein. Tandem mass spectrometers have the ability to fragment peptide ions and to record the resulting fragment ion spectra. For tandem mass spectrometers such as triple quadrupole, ion trap, quadrupole/TOF or TOF/TOF instruments, fragment ion spectra are generated by a process called collision-induced dissociation (CID), in which the peptide ion to be analyzed is isolated and fragmented in a collision cell, and the fragment ion spectrum is recorded. Typically these types of mass spectrometers are used in conjunction with ESI, with the exception of the TOF/TOF system, which is usually coupled to a MALDI source. This instrument consists of two TOF mass spectrometers coupled via a collision cell between them. The design combines the advantages of MALDI, such as high sensitivity for peptide analysis and relative insensitivity to salts, surfactants, and other contaminants, with high-energy CID, in which amino acids such as isoleucine and leucine can be distinguished by side-chain fragmentation. As with other types of sequencing mass spectrometers, a complete CID spectrum can be acquired in a single acquisition, obviating the need to sum as many as 10 spectra, as is necessary with PSD on a single TOF mass spectrometer.
Additionally, the MALDI-TOF/TOF mass spectrometer promises to be capable of acquiring tandem mass spectra at a rate that is an order of magnitude above the capabilities of ion trap (IT) and QTOF instruments, which will be significant for proteome studies. However, tandem mass spectra generated by the fragmentation of peptide ions in the gas phase at low collision energy are dominated by fragment ions resulting from cleavage at the amide bonds. Very little amino acid side-chain fragmentation is observed. Such spectra are much less complex than the high collision energy spectra generated, for example, in TOF/TOF instruments. The low-energy CID spectra generated by the types of

mass spectrometers most frequently used in proteomics are therefore relatively simple to interpret, and a straightforward nomenclature for annotating the MS-MS spectra has been adopted (Figure 6).

[Figure 6. Peptide fragment ion nomenclature. The scheme marks the a2, b2, c2 and x2, y2, z2 cleavage positions on a peptide backbone with side chains R1 to R3, indicates the residue mass, and shows the resulting b2 and y2 ions.]

The nomenclature differentiates fragment ions according to the amide bond that fragments and the end of the peptide that retains a charge after fragmentation. If the positive charge associated with the parent peptide ion remains on the amino-terminal side of the fragmented amide bond, the fragment ion is referred to as a b ion. If the charge remains on the carboxyl-terminal side of the broken amide bond, the fragment ion is referred to as a y ion. Since in principle every peptide bond can fragment to generate a b or y ion, subscripts are used to designate the specific amide bond that was fragmented to generate the observed fragment ions. b ions are designated by a subscript that reflects the number of amino acid residues present on the fragment ion counted from the amino-terminus, whereas the subscript of y ions indicates the number of amino acids present, counting from the carboxyl-terminus. These individual fragment ion m/z values as

shown in Figure 6 can be easily calculated from the amino acid sequence, using the nominal (i.e., monoisotopic value rounded to an integer value) residue masses found in Table 3. While it is relatively simple to calculate the elements of the b and y ion series from the peptide sequence, it is much less straightforward to read the amino acid sequence from the CID spectrum of a peptide ion. This is mainly because peptide fragmentation under the conditions encountered in the collision cell of a mass spectrometer is sequence dependent, and the rules for fragmentation are not completely understood.

Table 3. Residue and Immonium Ion Masses of 20 Common Amino Acids.
Amino acid (3/1 letter codes)   Nominal residue mass   Immonium ion mass
alanine (Ala/A)                 71                     44
arginine (Arg/R)                156                    129
aspartic acid (Asp/D)           115                    88
asparagine (Asn/N)              114                    87
cysteine (Cys/C)                103                    76
glutamic acid (Glu/E)           129                    102
glutamine (Gln/Q)               128                    101
glycine (Gly/G)                 57                     30
histidine (His/H)               137                    110
isoleucine (Ile/I)              113                    86
leucine (Leu/L)                 113                    86
lysine (Lys/K)                  128                    101
methionine (Met/M)              131                    104
phenylalanine (Phe/F)           147                    120
proline (Pro/P)                 97                     70
serine (Ser/S)                  87                     60
threonine (Thr/T)               101                    74
tryptophan (Trp/W)              186                    159
tyrosine (Tyr/Y)                163                    136
valine (Val/V)                  99                     72

The CID spectrum of a peptide ion acquired at low collision energy can be considered a composite of many discrete fragmentation events. Each peptide tandem mass spectrum will contain b and y ions as well as other fragment ions that can be used to interpret the amino acid sequence. These include diagnostic ions generated by the neutral loss of specific groups from amino acid side chains (e.g., the loss of ammonia (-17 u) from Gln, Lys, and

Arg or of water (-18 u) from Ser, Thr, Asp and Glu) and low mass ions that result from the fragmentation of amino acids down to a basic unit consisting of the side chain residue and an immonium functionality (Figure 6). The b ion series also often shows a satellite ion series in which each signal is 28 u lower than the corresponding b ion. These signals result from the neutral loss of carbon monoxide and are referred to as an a ion series. CID spectra can be further complicated by the presence of internal fragment ions that represent some contiguous sequence of amino acids in the peptide. These are generated if a specific peptide ion undergoes two or more fragmentation events. Empirical observation shows that internal fragments often occur if either proline [49] or aspartic acid residues are present in a sequence, and even more so at any aspartyl-proline bond, indicating that not all peptide bonds have the same propensity to fragment during low-energy CID. Although some of the rules that control peptide ion fragmentation in a collision cell have been determined, others remain to be studied, and the relative intensity of fragment ions in peptide CID spectra is uneven and somewhat unpredictable [50].

Furthermore, the choice of the enzyme used for proteolysis is very important. If proteins are completely digested with trypsin, lysine or arginine residues will be present at the carboxyl-terminus of all peptides except the C-terminal peptide of the original protein. A charge sequestered by lysine or arginine at the C-terminus tends to produce a more complete series of y ion fragments than is generated by peptides from digestion with chymotrypsin or another protease, in which lysine and arginine are distributed throughout the sequences rather than located at the C-terminus.
For peptide mass mapping, the information collectively contained in the masses of several peptides derived from the same protein is used for protein identification by database searching. In contrast, the CID spectrum of a single peptide can, in principle, contain a sufficient amount of information for unambiguous identification of a protein. Therefore, if a mixture of several proteins is concurrently digested, the components of the mixture can be identified based on the CID spectra, provided that at least one CID spectrum per protein is generated. It is hence no longer necessary to separate proteins to homogeneity prior to proteolysis.
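The b, y and a ion series described in this section can be computed directly from the nominal residue masses of Table 3. The sketch below assumes singly charged fragments: a b ion is the sum of its residue masses plus 1 u for the ionizing proton, a y ion is the sum plus 18 u (water) plus 1 u, and an a ion lies 28 u below the corresponding b ion. The function names and the example peptide are hypothetical.

```python
# Nominal residue masses (u), as listed in Table 3.
NOMINAL = {
    "A": 71, "R": 156, "D": 115, "N": 114, "C": 103, "E": 129, "Q": 128,
    "G": 57, "H": 137, "I": 113, "L": 113, "K": 128, "M": 131, "F": 147,
    "P": 97, "S": 87, "T": 101, "W": 186, "Y": 163, "V": 99,
}


def fragment_ions(peptide):
    """Return nominal m/z values of singly charged b, y and a ions."""
    masses = [NOMINAL[aa] for aa in peptide]
    n = len(masses)
    # b_i: first i residues plus one proton.
    b = [sum(masses[:i]) + 1 for i in range(1, n)]
    # y_i: last i residues plus water (18) plus one proton (1).
    y = [sum(masses[n - i:]) + 19 for i in range(1, n)]
    # a_i: b_i after neutral loss of CO (28).
    a = [m - 28 for m in b]
    return {"b": b, "y": y, "a": a}


def immonium(aa):
    """Nominal immonium ion mass: residue mass minus CO plus H (28 - 1 = 27)."""
    return NOMINAL[aa] - 27


ions = fragment_ions("GASPK")  # hypothetical tryptic peptide
for series in ("b", "y", "a"):
    print(series, ions[series])
```

For a singly protonated peptide, each complementary pair b_i and y_(n-i) sums to the [M+H]+ value plus 1, which is a convenient consistency check when annotating a spectrum by hand.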

Tandem MS has now become the definitive approach to the determination of peptide sequences. There are two ways to identify proteins from peptide MS-MS spectra. The first is de novo interpretation of the spectrum to obtain a peptide sequence, followed by BLAST searching of the sequence against a sequence database to identify the protein. This is a perfectly reasonable approach as long as there are only a few spectra to deal with. Manual de novo interpretation of an individual MS-MS spectrum takes between half an hour and a couple of days, depending on the complexity of the spectrum and the experience of the analyst. As noted earlier, some spectra do not contain complete b- or y-ion series, and thus it may not be possible to interpret a peptide sequence unambiguously from these spectra. Proteomics, however, relies on the identification of large numbers of proteins from MS-MS spectra, and the de novo sequencing/BLAST searching approach is clearly too slow for large-scale protein identification. The “slow step” in this case is the manual inspection of MS-MS spectra to determine sequence.

The second approach to protein identification bypasses the “slow step” (manual de novo sequence interpretation). In this approach, algorithms are applied to correlate MS-MS spectral data directly with peptide sequences in databases without actually interpreting each MS-MS spectrum individually. The only limitations of such an approach are the quality of the MS-MS spectra and the completeness and accuracy of the databases. If we obtain an MS-MS spectrum of a peptide whose sequence exists in a database, the right algorithm should be able to make the match. Such algorithms can match MS-MS data to protein sequences or to nucleotide (e.g., genome or EST) sequences that are translated into protein sequences. If the sequence of the analyzed peptide does not exist in the database, a correct match cannot be made.
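The idea of correlating a fragment spectrum with database sequences without interpreting it can be sketched as follows. This is a deliberately simple shared-peak count, not the scoring used by SEQUEST, Mascot or any other program named in this chapter. Nominal masses from Table 3 are used, and the function names, the candidate list and the tolerance are assumptions. Candidates are first filtered by precursor mass, and the survivors are ranked by how many observed peaks coincide with their predicted b and y ions.

```python
# Nominal residue masses (u), as listed in Table 3.
NOMINAL = {
    "A": 71, "R": 156, "D": 115, "N": 114, "C": 103, "E": 129, "Q": 128,
    "G": 57, "H": 137, "I": 113, "L": 113, "K": 128, "M": 131, "F": 147,
    "P": 97, "S": 87, "T": 101, "W": 186, "Y": 163, "V": 99,
}


def precursor_mh(peptide):
    """Nominal [M+H]+ of a peptide: residues + water (18) + proton (1)."""
    return sum(NOMINAL[aa] for aa in peptide) + 19


def by_ions(peptide):
    """Set of nominal singly charged b and y ion masses."""
    m = [NOMINAL[aa] for aa in peptide]
    ions = set()
    for i in range(1, len(m)):
        ions.add(sum(m[:i]) + 1)    # b ion: N-terminal part + proton
        ions.add(sum(m[i:]) + 19)   # y ion: C-terminal part + water + proton
    return ions


def rank_candidates(precursor, peaks, candidates, tol=0.5):
    """Rank candidate peptides by the number of shared fragment peaks."""
    scored = []
    for pep in candidates:
        # Keep only candidates whose mass agrees with the precursor ion.
        if abs(precursor_mh(pep) - precursor) > tol:
            continue
        theoretical = by_ions(pep)
        shared = sum(1 for p in peaks
                     if any(abs(p - t) <= tol for t in theoretical))
        scored.append((shared, pep))
    return sorted(scored, reverse=True)


# Hypothetical spectrum: the b and y ions of GASPK, precursor [M+H]+ = 459.
peaks = [58, 129, 216, 313, 147, 244, 331, 402]
# SAGPK is an isobaric permutation of GASPK; VTLR has a different mass.
candidates = ["GASPK", "SAGPK", "VTLR"]
result = rank_candidates(459, peaks, candidates)
print(result)
```

The two isobaric candidates cannot be told apart by precursor mass alone, but the fragment peaks separate them, which illustrates why a sequence-specific spectrum is more constraining than a peptide mass.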
The constraints on database searching of a given stretch of peptide sequence are so powerful that the tandem MS spectrum of a single peptide can be adequate for protein identification in an EST database. The approach is easily automated and can also be adapted to find peptides carrying specified posttranslational modifications by instructing the program to anticipate modification at specific residues [51]. A list of some Internet sites

with protein identification resources developed by these and other investigators can be found in Table 2. Such algorithms use readily available constraints in a decision-making process that distinguishes the correct match from all other sequences in the database. The availability of complete sequence databases, the development of mass spectrometric methods, and the sequence database search algorithms have therefore converged into a mature, robust, sensitive, and rapid technology that has considerably advanced the ability to identify proteins and constitutes the basis of the emerging field of proteomics.

In this research work, mass spectrometry takes a central place in the application of several proteomics approaches, where proteomics is understood broadly as “anything to do with proteins” and spans mining, protein-expression profiling and mapping of protein modifications. Specific methods were used to characterize proteins with particular properties, for example allergens or endogenous proteases, obtaining chemical information about the proteins without prior classical separation, using only a chemical fractionation procedure followed by MALDI TOF-TOF mass spectrometry. Two natural matrices were analyzed: olive tree pollen and raw milk from cows affected by mastitis. The first step was to establish a reproducible procedure for the extraction and fractionation of the total protein content from natural matrices, followed by profiling by mass spectrometry [52]. Protein and peptide expression profiles made it possible to single out specific functional markers and to characterize post-translational modifications involved in protein bioactivity. Peptide mass fingerprinting followed by MS-MS experiments was adopted to identify and characterize peptides and proteins [23, 24, 53-55].
Protein expression profiling by MALDI mass spectrometry was employed to determine the antigenic profile of Olea europaea pollen from different Mediterranean cultivars, followed by the full characterization of the major observed allergen, including post-translational modifications, exploiting the synergy between mass spectrometry and bioinformatics tools [56]. A refinement of this procedure was employed to obtain the protein MS profile of raw bovine milk, revealing the presence of a functional marker for the acute phase of mammary gland inflammation. It was also possible to suggest a new biomarker of mastitis and to obtain indirect information on the function of several specific endogenous milk proteases [57]. Finally, the most important proteins had to be fully characterized by means of MS-MS experiments and database searches, using different tools, for example PeptideCutter (www.expasy.org) to simulate specific enzymatic cleavage or GlycoMod (www.expasy.ch/tools/glycomod) to identify the glycan forms of an important pollen allergen.

PROTEIN-EXPRESSION PROFILING: Olea Europaea Olive Pollens.

Proteins are fundamental and integral food components, both nutritionally and functionally. Nutritionally, they are a source of energy and amino acids, which are essential for growth and maintenance. Functionally, they affect the physicochemical and sensory properties of various proteinaceous foods. In addition, many dietary proteins possess specific biological properties, which make these components potential ingredients of functional or health-promoting foods. The proteins that play an important role in the human diet can be divided into three main groups: animal, plant and microbial proteins. Applied genomics technologies (transcriptomics, proteomics, metabolomics, nutrigenomics, etc.) contribute to different research areas of nutritional science and food technology (Table 4). Combining proteomic technologies with genetics, molecular biology, protein biochemistry, biophysics and bioinformatics will accelerate the discovery of protein functional information.

In the study of proteomes from natural matrices, the real difficulty is to identify comprehensively all the proteins of a given organ and to analyse the physiological events occurring during a definite stage. In plant samples, changes in protein abundance have been found during germination, and variations in protein expression have also been found during priming (pre-germination followed by drying, a treatment that allows faster germination), seed imbibition, dehydration, mobilization of storage proteins and so on [58]. The proteomes of the different organs of a plant are obviously different. They are often studied separately in proteome databases [59], but comparisons between them are inadequate,

and most of them are actually related to the study of genetic variation. Several studies have demonstrated that organ-specific proteins are more variable between genotypes than organ-unspecific proteins, and that the level of genetic variation depends on the organ or tissue considered [60]. The higher level of genetic variation in the amounts of organ-specific proteins is probably related to a higher number of genes controlling their expression. Another important difficulty is the absence of specific proteomics databases for plant and vegetable proteins, for food allergens and for storage proteins. Since the proteomic study of the “green plant world” is only at its beginning, the characterization of proteins from specific natural matrices (plants, foods, fruits) is more difficult than that of proteins from human tissues or cells.

Table 4. Areas of the nutritional science and food technology.
1. Screening for novel functional bioactives.
   • Availability of rapid screening methods for detection of bioactivity.
2. Safety evaluation of food constituents.
   • Evaluation of absorption, body distribution and metabolism of food ingredients.
3. Detection and control of food.
   • Identification of biomarkers (metabolites, proteins) specific for particular food spoilage and/or pathogenic microorganisms.
4. Efficacy testing of bioactive food ingredients.
   • Changes in gene expression and proteome relevant to the states or treatment of certain diseases.
5. Food allergy.
   • Identification of allergenic proteins through sophisticated proteomics based on recognition of specific posttranslational modifications.
6. Quality and authenticity of foods.
   • The proteome of certain foods (wheat, wine, fish) can be used to authenticate food origin or food quality.
7. Production of food ingredients.
   • The yield of a bioprocess may be controlled through the metabolome/proteome of the micro-organism used for production.
8. Food processing.
   • The proteome and/or metabolome of the starter culture of fermentation processes (beer, cheese, sausage, etc.) can be used to predict the quality of the fermented end-product.

The term allergonomics was coined to designate the use of proteomics approaches in the study of allergens. Allergens are defined by their ability to cause the induction of

a hypersensitivity response when encountered by the immune system of sensitive individuals. Inhalation or ingestion of potential allergens leads to the production of allergen-specific IgE antibodies. The incidence and severity of allergic disorders are steadily increasing worldwide. Exposure to common environmental antigens is the cause of allergic conditions such as hay fever, allergic asthma, and eczema, which affect up to 25% of the population in developed countries. Most of the inhalant or food allergens of plant origin are proteins ranging from 10 to 50 kDa. Pollen grains of various weeds, trees, and grasses are a significant source of inhalant allergens [61].

Olive (Olea europaea) pollen is considered one of the most important causes of respiratory allergic disease in the Mediterranean region. In Spain [62], southern Italy [63], Greece [64] and Turkey [65], olive pollen is an important cause of pollinosis. The main pollen season is from April to June. The frequency of olive-induced pollinosis is increasing as a consequence of improved diagnostic procedures and as a result of changes in farming practices [66]. Olea europaea pollinosis is clinically characterized more by rhinoconjunctival symptoms than by bronchial asthma. Moreover, polysensitization to olive pollen is more frequent than monosensitization [50,51]. In southern Italy, the frequency of positivity to Olea pollen allergens among all skin prick test-positive patients is 13.49% in adults and 8.33% in children. In pollinosis patients of the Naples area, monosensitization to olive was identified in only 1.33% of children and in 2.28% of adults; in all the remaining patients, sensitization to olive pollen was associated with other allergens, mainly derived from pollen grains [50]. Interestingly, children and adults with monosensitization to olive are frequently affected by year-long symptoms that usually do not increase during the olive-pollen season.

The antigenic profile of Olea europaea pollen from different Mediterranean cultivars was obtained by MALDI mass spectrometry using a simple procedure of chemical fractionation of the whole antigen extract. Some of the features of protein structure and distribution probably depend on cultivar adaptation to the environment. Mass spectrometry is currently applied, with success, in protein profiling of natural matrices [67,68]. Our group has developed high-tech analytical methods as tools for the assessment of food quality and safety [69,70]. In a survey of all possible allergen candidates

whose profiling could provide clues for the unambiguous identification of olive cultivars and of their subvarieties, we have undertaken a detailed analysis of olive pollen extracted from eight different typical Mediterranean cultivars. When inhaled, the pollen of Olea europaea is an important causative agent of type I allergy in the Mediterranean area [71,72]. More than 30% of the population in this area is affected by type I allergy during the pollination season, and more than 80% of the olive-tree-allergic patients are sensitive to the protein Ole e 1, the major olive pollen allergen [73]. Several separation methods have been employed for the isolation of the allergens, such as SDS-PAGE [74,75], high performance liquid chromatography (HPLC) [76], immunodetection [77], and gel filtration [78]. Allergenic candidates of 7, 9, 14, 15, 16, 18, and 36 kDa (Table 5, Part 1 and 2, § Appendix: A.1), whose presence in olive pollen has been ascertained as previously mentioned, have been immunostained with sera from individual olive-allergic patients [79,80].

| Allergen name | MW, kDa, SDS-PAGE (1) | MW, Da (2) | P.I. | Accession number (4) |
| Ole e 1 | 18-21 | 16330 | 6.18 | P19963 |
| Ole e 2 | 15-18 | 14489 | 5.06 | O24169 |
| Ole e 3 | 9.2 | 9356 | 4.49 | O81092 |
| Ole e 4 | 32 | 2711 | 3.77 | P80741 |
| Ole e 5 | 16 | 2973 | 4.65 | P80740 |
| Ole e 6 | 10 | 5833 | 4.96 | O24172 |
| Ole e 7 | 9-11 | 9905-10032 (5) | 3.56 | P81430 |
| Ole e 8 | 21 | 18907 | 4.51 | Q9M7R0 |
| Ole e 9 | 46 | 48830 | 5.21 | Q94G86 |

Sequence (3), listed in table order: C C P P C P C C.

Table 5_Part 1. Olive pollen allergens with clinical relevance, from the list developed and maintained by the Allergen Nomenclature Subcommittee of the IUIS (www.allergen.org), including allergens whose IgE reactivity has a prevalence of >5%. Legend: (1) Apparent molecular mass in SDS-PAGE. (2) Theoretical molecular weight. (3) Sequence information obtained by C, cDNA; P, peptide sequence; N, nucleotide sequence. (4) Swissprot database. (5) Mass spectrometry determination. (Table 5_Part 2 continues in Appendix, A1.)

The concentration level of the major olive pollen allergens, estimated using monoclonal antibodies alone or in combination with gel scanning densitometry [53,81], indicates a variation between plant species [82]. Cultivars and probably local varieties or sub-varieties of olive trees

present special features that depend on their adaptation to the environment or ecotype. Ecosystem and crop management are factors that are able to induce changes in the allergenic profile of a given variety or cultivar [83,84]. Therefore, pollen protein profiling could be a useful tool for cultivar discrimination.

The olive (Olea europaea) pollens of the Mediterranean cultivars Ottobratica (1), Carolea (2), Dolce di Rossano (3), Cassanese (4), Coratina (5), Nocellara del Belice (6), Villacidro (7), and Sinopolese (8) were selected as case studies to determine a protein profile of the whole extract and to identify and characterize specific proteins without any previous chromatographic or two-dimensional gel separations. A simple procedure of chemical fractionation of the whole antigen extract was developed, whereby less complex, or pure, fractions of antigen candidates were obtained. Portions (50 mg) of pollen grains (1-8) were extracted with 1 mL of aqueous 50 mM NH4HCO3 for 20 min at room temperature, followed by centrifugation at 14 000 rpm for 2 min. The supernatant portion (saline extract) was separated and stored at -20 °C. A 200 µL portion of whole extract was precipitated with 400 µL of CHCl3/CH3OH 1:3 (v/v), and the precipitated protein pellet was partitioned consecutively, under magnetic stirring and at room temperature, for 10 min with (a) 150 µL of 50 mM NH4HCO3/CH3OH 1:1 (v/v) and (b) 150 µL of 50 mM NH4HCO3/CH3CN 1:1 (v/v) (Chart 1).

Chart 1. Procedure of chemical fractionation (labels in the chart: olive tree pollen, whole extract, lipophilic fraction (s), pellet, fraction a, fraction b).
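The solvent volumes in the fractionation procedure above can be worked out from the stated v/v ratios. The following is only a minimal sketch: the helper name `split_by_ratio` is hypothetical, and it assumes nominal, additive volumes (any volume change on mixing is ignored).

```python
from fractions import Fraction


def split_by_ratio(total_ul, parts):
    """Split a total volume (microlitres) into components from v/v parts.

    Hypothetical helper for illustration; assumes volumes are additive.
    """
    whole = sum(parts.values())
    return {name: float(Fraction(total_ul) * Fraction(share, whole))
            for name, share in parts.items()}


# Precipitation step: 400 uL of CHCl3/CH3OH 1:3 (v/v) added to 200 uL of extract.
precipitant = split_by_ratio(400, {"CHCl3": 1, "CH3OH": 3})

# Partition steps: 150 uL of each 1:1 (v/v) mixture.
fraction_a = split_by_ratio(150, {"50 mM NH4HCO3": 1, "CH3OH": 1})
fraction_b = split_by_ratio(150, {"50 mM NH4HCO3": 1, "CH3CN": 1})

# Nominal composition of the precipitation mixture (extract plus precipitant).
precipitation_mixture = {"whole extract": 200.0, **precipitant}

for label, mix in [("precipitant", precipitant),
                   ("fraction a solvent", fraction_a),
                   ("fraction b solvent", fraction_b),
                   ("precipitation mixture", precipitation_mixture)]:
    print(label, mix)
```

Under these assumptions the precipitant contributes 100 µL of CHCl3 and 300 µL of CH3OH, and each partition solvent contains 75 µL of aqueous buffer and 75 µL of organic solvent.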

All fractions (a, b) were directly analyzed by MALDI-TOF in linear mode. A 1 µL portion of each fraction was analyzed using α-cyano-4-hydroxy-trans-cinnamic acid (α-CHCA, 0.3% in TFA) as matrix. MALDI-TOF analyses were performed using a 4700 Proteomics Analyzer mass spectrometer from Applied Biosystems (Foster City, CA) equipped with a 200-Hz Nd:YAG laser at 355-nm wavelength. Linear MALDI MS spectra were acquired by averaging 2500 laser shots, with a mass accuracy of 500 ppm in default calibration mode; calibration was performed using the following set of standards: insulin (bovine, [M + H]+ avg m/z 5734.59), apomyoglobin (horse, [M + 2H]2+ avg m/z 8476.78, [M + H]+ avg m/z 16 952.56), and thioredoxin (Escherichia coli, [M + H]+ avg m/z 11 674.48).

High solubility in aqueous medium is an important prerequisite for allergen candidates, because their biological activity is better correlated with their concentration and rapid release from airborne particles than with their intrinsic properties [85]. The partition coefficient of allergens and the antigenic profile of olive pollen are strongly related to the solvents used for the extraction [86]. Whole protein extracts can, therefore, be chemically fractionated, and the antigen content of each fraction can be varied according to the selected experimental conditions. Owing to its specificity and better resolution in comparison with the conventional two-dimensional gel electrophoresis (2-D) approach, MALDI mass spectrometry is the methodology of choice for obtaining reliable results in the profiling of olive pollen. The use of saline solutions containing sodium chloride, phosphate buffer, and borate buffer, either to prepare the whole pollen extract or for its chemical fractionation, is not amenable to direct MALDI-MS analysis. Therefore, ammonium bicarbonate solutions were chosen to prepare the whole antigenic extract from pollen samples of cultivars 1-8.
Ammonium bicarbonate should favour the formation of carboxylate/ammonium ion pairs, thus affecting the solubility of the proteins to be extracted in moderately polar solvents, such as acetonitrile/water mixtures. Moreover, ammonium counter-ions have often been used to improve the desorption of high molecular weight proteins [87] and do not interfere with the mass spectrometric analysis.
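To give a feeling for what the 500 ppm mass accuracy quoted for the linear-mode acquisition means in absolute terms, the sketch below converts it to a window in m/z units at each singly charged calibrant. The function name `ppm_window` is hypothetical; the calibrant values are those listed in the text.

```python
def ppm_window(mz, ppm=500.0):
    """Return the (low, high) m/z bounds for a relative tolerance in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


# Singly charged calibrants quoted in the text (average m/z values).
calibrants = {
    "insulin [M+H]+": 5734.59,
    "thioredoxin [M+H]+": 11674.48,
    "apomyoglobin [M+H]+": 16952.56,
}

for name, mz in calibrants.items():
    low, high = ppm_window(mz)
    half_width = (high - low) / 2
    print(f"{name}: {mz:.2f} +/- {half_width:.2f} ({low:.2f} to {high:.2f})")
```

At 500 ppm the window grows linearly with m/z, from roughly ±2.9 units at insulin to roughly ±8.5 units at apomyoglobin.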

The sample preparation protocol was planned to distribute the amount of information stored in the proteome of each olive pollen over a set of three MALDI spectra that could be independently evaluated and matched to retrieve data for their comparison. The data set displayed by the three spectra provides the entire profiling of a given entity. One lipophilic (s) and two hydrophilic fractions (a-b), respectively, can be obtained from each sample (Chart 1). Accordingly, the antigenic profile of Cassanese 4 displayed four allergens: Ole e 7, Ole e 6, Ole e 2, and Ole e 1 in the 5-20-kDa mass range (Figure 7).

[Figure 7: two MALDI spectra, panels A and B; x-axis m/z, y-axis relative intensity (%).]

Figure 7. MALDI spectra of fractions (a) 4a and (b) 4b. Both fractions represent a mixture of allergens. Part (a) shows the typical pattern of Ole e 7, Ole e 6, and some polypeptides between 5 and 7 kDa, whereas part (b) shows the ion species [Ole e 2]2+, [Ole e 1]2+, [Ole e 7]+, and [Ole e 1]+.

The four ion peaks in the m/z 9791-10041 range can be ascribed to Ole e 7. In fact, the apparent SDS-PAGE molecular mass of this allergen is 9-11 kDa (Table 5_Part 1, column 2, row 8). The predicted molecular mass of the Ole e 7 fragment from the peptide sequence is 2199 Da [88]. The only experimental values available are 9905-10302 Da, obtained by low-resolution MALDI mass spectrometry (Table 5_Part 1, column 3, row 8). Therefore, it can be suggested that the peaks in the range 9791-10041 Da could correspond to the expected Ole e 7 (Figure 7). The ion peak at m/z 5821 (Figure 7) was attributed to the olive pollen allergen Ole e 6. This allergen has been isolated, purified and biochemically characterized [89], and its specific cDNA was cloned and sequenced [90,91] (Table 5, row 7). Considering that the value of 5833 Da (Table 5_Part 1, column 3, row 7) corresponds to the predicted molecular weight from cDNA and that there are no other known allergens in this range, the observed ion peak at m/z 5821 can be ascribed to Ole e 6. Meanwhile, Ole e 2, an allergen that consists of 134 amino acids [92,93], probably corresponds to the doubly charged ion at m/z 7396, since it is known that the predicted average molecular mass is 14.4 kDa (Table 5, row 3). The singly and doubly charged ions at m/z 16300-17829 and at m/z 8837-8989, respectively, were ascribed to Ole e 1 (Table 5, row 2). The two fractions a and b (Figure 7) show significantly different protein patterns; a complete pool of Ole e 1 isoforms is indeed predominant in hydrophilic fraction 4b. The MALDI spectrum of fraction 4s is characterized by the presence of ion peaks corresponding to low molecular weight proteins, as a consequence of the fractionation procedure, which lowers the solubility of lower molecular weight proteins in aqueous ammonium bicarbonate. The MALDI spectra of the first hydro-soluble fractions of 1, 3, and 5-8 showed similar protein expression (§ Appendix: A2).
In particular, the MS spectrum of fraction 5a (Figure 8a) shows four peaks in the mass range 9.7-10 kDa, whereas that of 3a (Figure 8b) displays one additional peak at m/z 9186. A closer inspection of the main four peaks present in that mass range shows a difference of 70 and 180 mass units between two adjacent peaks and between pairs of peaks, respectively. The allergens
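The pairing of singly and doubly charged ions used in the assignments above rests on simple charge-state arithmetic, which can be sketched as follows. This is an illustrative sketch only: it assumes protonated species [M + zH]z+ and uses a proton mass of 1.00728 Da; the function names are hypothetical, and the two peak values are the Ole e 1 labels read from Figure 7, panel B.

```python
PROTON = 1.00728  # proton mass in Da (assumed charge carrier)


def neutral_mass(mz, charge):
    """Neutral mass M of an ion observed as [M + zH]z+ at the given m/z."""
    return charge * (mz - PROTON)


def expected_mz(mass, charge):
    """m/z at which a species of neutral mass M appears as [M + zH]z+."""
    return (mass + charge * PROTON) / charge


# Ole e 1 peaks labelled in Figure 7, panel B.
mass_from_doubly = neutral_mass(8915.0, 2)
mass_from_singly = neutral_mass(17829.0, 1)

print(f"M from [Ole e 1]2+ at m/z 8915:  {mass_from_doubly:.1f} Da")
print(f"M from [Ole e 1]+  at m/z 17829: {mass_from_singly:.1f} Da")
print(f"difference: {abs(mass_from_doubly - mass_from_singly):.2f} Da")
```

Both peaks point to a neutral mass near 17 828 Da, which is the kind of agreement expected when a singly and a doubly charged signal arise from the same species.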
