Proteomics
Past exams
Set of flashcards Details
Flashcards | 45 |
---|---|
Language | English |
Category | Biology |
Level | University |
Created / Updated | 03.04.2023 / 03.04.2023 |
Weblink |
https://card2brain.ch/box/20230403_proteomics
|
peptide MS intensity dependent on
- concentration in the sample
- digestion efficiency
- recovery during sample preparation
- ionization of the peptide
- matrix effect
- specific MS detection properties
chemical labeling
- Chemical labeling: addition of an isotope tag to peptides, after protein extraction and digestion. Relative quantification using MS1 or MS2 spectra.
- Tandem Mass Tag (TMT): isobaric labeling: all reagents add the same mass -> labeled peptides are isobaric. TMT reporter ions form upon fragmentation and have different masses. Quantification is done using the intensities of the TMT reporter ions recorded in the MS2 spectrum.
- Pros:
- Multiplexing: Allows simultaneous measurement of up to 18 samples – no missing values; high throughput
- No increase in complexity of MS1 spectra
- Tag improves peptide detection
- Applicable to any biological sample
- Robust: quantification is less dependent on reproducible LC conditions; high quantitative precision because all conditions are measured simultaneously
- Cons:
- Limited signal in MS/MS can compromise quantification sensitivity
- Co-fragmenting peptides can compress ratios leading to underestimation of true ratios -> reduced accuracy due to ratio compression
- Expensive reagents
- Lower identification rate (dominance of abundant peptides limits dynamic range)
- Accumulation of technical variance up to the additional peptide labeling step (samples are processed separately until they are combined after labeling)
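The ratio compression listed under the cons can be illustrated with a small calculation. This is a minimal sketch under a simplifying assumption that is not part of the flashcards: one co-isolated background peptide with a 1:1 ratio adds the same reporter signal to both channels. The function name and the numbers are hypothetical.

```python
def observed_ratio(true_ratio: float, target_signal: float, background_signal: float) -> float:
    """Reporter-ion ratio (channel A / channel B) for a target peptide that is
    co-fragmented with a background peptide present 1:1 in both channels."""
    # Target peptide: channel B carries target_signal, channel A true_ratio times as much.
    channel_a = true_ratio * target_signal + background_signal
    channel_b = target_signal + background_signal
    return channel_a / channel_b


print(observed_ratio(4.0, 100.0, 0.0))   # no interference: 4.0
print(observed_ratio(4.0, 100.0, 50.0))  # with interference: 3.0
```

The true 4-fold change is reported as 3-fold once the background contributes, which is the underestimation described above.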
spike in standard
- Spike-in standards: Super-SILAC: SILAC-labeled cells are digested and spiked into tissue samples -> relative quantification of tissue peptides/proteins via heavy spike-in peptides for the entire proteome
- Pros:
- Unlimited number of samples
- More accurate than label-free
- Applicable to all biological samples for which Super-SILAC standard can be generated
- Cons:
- Expensive reagents
- Presence of labeled species increases MS1 complexity and decreases dynamic range
- Missing values
- Only ratios of ratios -> decreased quantitative accuracy/precision
- standard needs to reflect sample composition, only proteins present in SILAC reference sample can be quantified
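The "ratio of ratios" idea can be sketched as follows. This is an illustrative example only: each sample is compared to the same heavy spike-in standard, and two samples are then compared through their light/heavy ratios. The function name and intensities are hypothetical.

```python
from typing import Optional


def ratio_of_ratios(light_a: float, heavy_a: float,
                    light_b: float, heavy_b: float) -> Optional[float]:
    """Compare a peptide between sample A and sample B via the shared heavy standard.

    Returns None when the heavy signal is missing, because only peptides that
    are present in the SILAC reference can be quantified this way.
    """
    if heavy_a <= 0 or heavy_b <= 0:
        return None
    return (light_a / heavy_a) / (light_b / heavy_b)


print(ratio_of_ratios(200.0, 100.0, 50.0, 100.0))  # 4.0
print(ratio_of_ratios(200.0, 0.0, 50.0, 100.0))    # None
```

Because two measured ratios enter the result, the measurement errors of both propagate, which matches the decreased accuracy/precision noted in the cons.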
Which strategy is best for which type of experiment? (acquisition)
- DDA/shotgun:
- Identification of 1000s of proteins and PTMs
- Global proteome mapping
- Quantification of many proteins in relatively few samples
- Targeted (SRM/PRM)
- Precise, accurate and reproducible quantification over many samples
- With SRM and PRM only 10s to 100s of proteins can be quantified
- DIA
- When you want to quantify a large number of proteins and peptides over many samples
- When you do NOT need maximum selectivity and sensitivity
- When your proteins/peptides of interest change in the future
clinical proteomics - promises and hurdles
Promises
- To study disease biology (identify disease drivers, patient stratification, drug target discovery)
- To identify and validate biomarkers
- To treat patients according to individual needs (“personalized” precision medicine)
Hurdles
- Heterogeneity within the sample (different cell types in tissue)
- Heterogeneity between samples of same tissue (different composition at different sites)
- Reference proteome?
- Inter-patient variability (large cohorts)
- High dynamic range (high sensitivity required)
- Contaminations (e.g. blood in tissue samples, paraffin in FFPE)
- Limited sample quality (ischemia times)
- Limited amount of sample
- Mass spectrometry requires expert knowledge
- Expensive machinery and reagents
- Often not very robust (maintenance, down-times of mass specs)
biomarker development pipeline
- Discovery
- Which biomedical problem?
- What kind of specimen?
- Small cohort (statistical power?)
- Shotgun proteomics
- identification of significant proteins
- biomarker candidates
- Verification
- What is appropriate patient group? (availability, regional bias, etc)
- What are appropriate control groups? (healthy?, gender, age, etc.)
- Medium sized cohort (statistical power)
- Targeted proteomics
- quantification of previously identified biomarker candidates
- Validation
- Collaboration with clinical partners (preclinical trial) and industry (assay development)
- Large cohort size (statistical power)
- Often orthogonal assays, e.g. ELISA
- Confirmation of clinical relevance in clinically relevant tissue and or body fluid in different cohorts in different geographical regions, etc
What to do when analysing data
1. Quality control:
- QC raw data: avoid high background / low signal-to-noise ratio, detector saturation
- Data distribution: parametric statistical tests (e.g. t-test, ANOVA) require approximately normally distributed data
- PCA analysis: groups samples according to similarity of composition. Identify sample processing errors and batch effects
- systematic errors due to technical variations can be solved by normalisation
2. normalisation
- column (=experiment) wise: normalisation of unequal sample amounts
- row wise (=proteins): in replicate experiments: normalisation of batch effects
3. statistical analysis
- fold change: ratio between 2 conditions
- statistical tests: Student's t-test or ANOVA
- volcano plot: p value vs fold change
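The normalisation and statistics steps above can be sketched with the Python standard library. This is a simplified illustration, not a full pipeline: median normalisation is one possible choice of column-wise normalisation, all function names are hypothetical, and the p-value (which needs the t distribution, e.g. from a statistics library) is left out.

```python
import math
import statistics


def median_normalise(column):
    """Column-wise normalisation: scale one experiment by its median intensity."""
    med = statistics.median(column)
    return [x / med for x in column]


def log2_fold_change(cond_a, cond_b):
    """log2 of the ratio of mean intensities between two conditions."""
    return math.log2(statistics.mean(cond_a) / statistics.mean(cond_b))


def student_t(cond_a, cond_b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(cond_a), len(cond_b)
    va, vb = statistics.variance(cond_a), statistics.variance(cond_b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(cond_a) - statistics.mean(cond_b)) / se


# Hypothetical intensities of one protein in two conditions (3 replicates each).
treated = [8.0, 10.0, 12.0]
control = [4.0, 5.0, 6.0]
print(log2_fold_change(treated, control))  # 1.0, i.e. a 2-fold increase
print(student_t(treated, control))
```

In a volcano plot, each protein would be drawn at x = log2 fold change and y = -log10(p value).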
data visualization and interpretation
- hierarchical clustering
- gene ontology: annotation according to molecular function, biological process and cellular compartment
- StringDB: functional protein association networks
functions of PTMs
- cellular localization
- (de)activation
- signal transmission
- cell interaction and communication
- degradation
- solubility
- stability
Methods to study PTMs
- targeted; antibody based methods: e.g. western blot, reversed phase protein arrays
- explorative: mass spectrometry -> requires enrichment
Enrichment of PTMs
enrichment to focus our measurement on the sub proteome of interest
- enrichment of phosphopeptides
- affinity enrichment: IMAC (immobilized metal affinity chromatography) -> affinity of phospho groups toward metal ions (Fe3+)
- antibody based: ab that is directed against certain PTM (immuno affinity method)
- enrichment of ubiquitination
- antibody based: ab specific for the di-glycyl moiety present on the side chains of ubiquitinated lysine residues after trypsin digestion
Localisation of PTMs
Mass spec
Phosphorylation: neutral loss of 97.98 Da (H3PO4)
MS1: peptide mass increased by mass of the PTM
MS2: mass of some fragments shifted by ∆m => depending on where P is, it can be seen on diff. fragments
scoring system tells us which localization is more probable
Ubiquitination: K-ε-GG shows no neutral loss
∆m: Lys + 2x Gly = 242.1 Da (residue mass of the modified lysine, i.e. a shift of +114.04 Da on Lys)
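The mass values above can be checked with a short calculation. The monoisotopic masses below are standard reference values and are not taken from the flashcards; the helper `shifted_fragments` is a hypothetical illustration of how the position of a modification determines which b- and y-ions carry the shift ∆m.

```python
# Monoisotopic masses in Da (standard reference values, assumed here).
GLY_RESIDUE = 57.02146
LYS_RESIDUE = 128.09496
HPO3 = 79.96633    # mass added to a peptide by phosphorylation
H3PO4 = 97.97690   # neutral loss observed for phosphopeptides

K_GG_SHIFT = 2 * GLY_RESIDUE               # di-glycyl remnant on lysine
K_GG_RESIDUE = LYS_RESIDUE + K_GG_SHIFT    # modified lysine residue


def shifted_fragments(peptide_length: int, site: int):
    """Return the indices of b- and y-ions that contain the modified residue.

    `site` is the 1-based position of the modified residue. b_i covers
    residues 1..i, y_j covers the last j residues of the peptide.
    """
    b_ions = [i for i in range(1, peptide_length) if i >= site]
    y_ions = [j for j in range(1, peptide_length) if j >= peptide_length - site + 1]
    return b_ions, y_ions


print(round(K_GG_RESIDUE, 1))   # 242.1
print(round(K_GG_SHIFT, 2))     # 114.04
print(shifted_fragments(6, 3))  # ([3, 4, 5], [4, 5])
```

For a 6-residue peptide modified at position 3, b3 to b5 and y4 to y5 are shifted while b1, b2 and y1 to y3 are not. A localisation score builds on this pattern.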
Affinity purification what to consider
fish for interactions using an affinity ligand
to consider:
- specific vs unspecific binding: not all proteins bind specifically -> the experimental setup needs to distinguish the two
- interaction dilemma: in affinity purification, mostly proteins with high abundance &/or affinity/residence time seen
- protein abundance: required for affinity purification: high recovery, fast purification methods, high sensitivity analytics (to also see low abundant complexes)
- biochemical realities: affinity & kinetics are affected by experimental conditions
- timing:
affinity purification methods
- Antibody - targeted: co-immunoprecipitation: AB against POI, pulldown of complex, enrichment of bait protein and interacting proteins
- pro: no cloning, fast
- con: not generic, cross reactivity
- antibody - generic: epitope tagging: POI genetically fused to epitope tag
- pro: generic, highly reproducible
- con: tag may influence protein expression
- Tandem affinity purification (TAP): POI with dual affinity tag; TEV: cleavage site for TEV protease; CBP: calmodulin-binding peptide. Elution of bait by EGTA
- pro: generic, highly reproducible
- con: tag might influence protein function
Peak capacity
- Peak capacity describes the maximum theoretical number of analytes that can be successfully separated with a given column and a set of analytical parameters.
- It depends on chromatographic resolution and gradient time.
- In practice: highest peak capacities are obtained with long efficient columns and long gradient times
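A common approximation for gradient separations, which is not stated in the flashcards and is assumed here, estimates peak capacity as 1 + gradient time / average peak width at base. The sketch below uses hypothetical values to show why longer gradients and narrower peaks (more efficient columns) both increase peak capacity.

```python
def peak_capacity(gradient_time_min: float, peak_width_min: float) -> float:
    """Approximate gradient peak capacity: n_c = 1 + t_g / w.

    gradient_time_min: gradient duration t_g in minutes
    peak_width_min:    average peak width at base w in minutes
    """
    return 1 + gradient_time_min / peak_width_min


print(peak_capacity(60, 0.25))   # 241.0
print(peak_capacity(120, 0.25))  # 481.0
```

In practice peaks also broaden somewhat with longer gradients, so doubling the gradient time usually gives less than double the peak capacity.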
What is meant by “proteomics”? Why is it more complex compared to genomics? Why is it important to study proteins in context (Give examples)?
Proteomics is the large-scale study of proteins: expression, structure, functions.
There is one genome but different proteomes from the same genome. The proteome is the entire complement of proteins, including the modifications made to a particular set of proteins, produced by an organism or system. This will vary with time and distinct requirements, or stresses that a cell or organism undergoes
Context: in order to understand biological systems at the molecular level, we must analyze proteomes quantitatively, in time and space and under different (patho-)physiological conditions; the molecular constituents of biological systems do not operate in isolation but through thousands of interactions, e.g. in mitosis
Bottom-up proteomics is the digestion of proteins into peptides.
Explain the “in-gel digestion” workflow
Two advantages and two disadvantages of this procedure
In-gel digestion: typically performed with samples containing detergents or high amounts of salts
1. Sample reduction (e.g. DTT) and alkylation 2. SDS-PAGE (desalting) 3. Cut out protein bands 4. Wash gel and let it dehydrate 5. Reswell in protease solution and digest 6. Extract peptides
+ Simple to perform (scalable to high throughput)
+ tolerant to samples with "weird" buffer ingredients (e.g. detergents, lipids, DNA)
+ Isolation of proteins of interest possible (single bands)
- sample amount limited to gel capacity
- limited visual sensitivity
- lower peptide recovery
What is “ion pair reverse phase chromatography”? Which types of interactions are established between the analytes and the stationary phase?
Ion pair reversed-phase chromatography is a form of liquid chromatography; it has the highest resolution
hydrophobic and electrostatic interactions
The stationary phase is highly hydrophobic; the mobile phase starts hydrophilic and increases in hydrophobicity over time. Analytes eluting early are hydrophilic, those eluting late are hydrophobic
Interactions: direct hydrophobic interactions between nonpolar peptide side chains and the nonpolar stationary phase & electrostatic interactions between charged (polar) side chains of amino acids in a peptide and an amphiphilic ion pairing agent (e.g. trifluoroacetic acid) that mediates interactions with the nonpolar stationary phase
Ionization of the peptides is necessary for measurements with the mass spectrometer.
Why is MALDI a soft ionization technique? How does the technique prevent analytes from thermal degradation?
MALDI = Matrix Assisted Laser Desorption Ionization
The matrix traps laser energy for desorption (absorption maximum ideally at laser wavelength), protects the analytes from thermal decomposition (hot and cold matrices) and ionises analyte molecules
Soft ionization technique because the matrix absorbs the energy of the laser and helps to transfer it to the analyte, causing it to vaporize and ionize gently without much fragmentation of the peptides.
Soft ionization from the condensed phase (crystals)
MALDI and ESI are both soft ionization methods because they generate gas-phase ions without causing significant fragmentation or degradation of the analyte molecules.
In MALDI, the analyte is mixed with a matrix material that absorbs laser energy and facilitates desorption and ionization of the analyte: most of the energy deposited by the laser is taken up by the matrix, the matrix heats up, matrix and analyte are desorbed into the vacuum, and rapid cooling of the evaporated material prevents thermal decomposition. The resulting ions are mainly intact and carry a single charge.
Measuring ionized peptides with mass analyzers.
Name two mass analyzers, explain how they work and how they determine the m/z ratio
For each: state one characteristic in what they are really good at in comparison to others (e.g. mass accuracy, resolution, m/z range, dynamic range, measurement time, quantification)
Which one would you choose for analysis of intact antibody. Explain in one sentence.
- TOF: Analytes (ions) are accelerated by the same voltage (ions of equal charge start with the same kinetic energy) and then travel through a field-free tube (no force applied). Their velocities differ depending on m/z, so the time an ion takes to travel from the ion source to the detector is used to determine its m/z value: t_d = C * sqrt(m/z)
- Resolution increased by reflection mode TOF and by fast detector digitizer
- TOFs can separate ions at a rate of >10,000 spectra/second
characteristic: scan speed 100 µs, m/z range >500,000, resolution >20,000, high sensitivity
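The relation t_d = C * sqrt(m/z) follows from energy conservation: an ion with z charges accelerated by a voltage U gains kinetic energy z*e*U = m*v^2/2, so the flight time over a drift length L is t = L * sqrt(m / (2*z*e*U)). The sketch below computes this for an idealised linear TOF; the acceleration voltage and tube length are hypothetical example values.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulomb
DALTON = 1.66053906660e-27   # atomic mass unit in kg


def flight_time(mass_da: float, charge: int,
                voltage: float = 20_000.0, length_m: float = 1.0) -> float:
    """Flight time in seconds of an ion in an idealised field-free drift tube.

    voltage (V) and length_m (m) are assumed instrument parameters.
    """
    mass_kg = mass_da * DALTON
    return length_m * math.sqrt(mass_kg / (2 * charge * E_CHARGE * voltage))


t_1000 = flight_time(1000.0, 1)
t_4000 = flight_time(4000.0, 1)
print(t_4000 / t_1000)  # a 4x higher m/z gives a 2x longer flight time
```

Ions with the same m/z (for example 2000 Da with two charges and 1000 Da with one charge) arrive at the same time, which is why a TOF measures m/z and not mass.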
- Quadrupole: Consists of four parallel rods (two positive and two negative poles) arranged at the corners of a square. RF (radiofrequency) and DC (direct current) voltages are applied to create an oscillating electric field that allows only ions with a specific m/z ratio to pass through the analyzer (at lower amplitude, ions with lower m/z pass). The ions that pass through the quadrupole are registered by a detector, which records the intensity of the ion signal. The ion intensity data is then used to generate a mass spectrum.
- Resolution depends on the number of oscillations along the path -> resolution increases with longer rods or decreasing scan speed
- m/z range limitations (upper limit of 4000-8000 m/z)
characteristic: dynamic range, also high accuracy, quantification
For the analysis of an intact antibody, I would choose the TOF mass analyzer because its very large m/z range (>500,000) covers intact antibody ions and its high resolution helps to distinguish antibody isoforms with slight mass differences.
Property | Quadrupole | TOF | Ion trap | Orbitrap |
---|---|---|---|---|
Mass accuracy (ppm) | 100-1000 | 1-10 | 100-1000 | 1-2 |
Resolution | 200-2000 | >20,000 | 200-20,000 | >100,000 |
m/z range | 4000-8000 | >500,000 | 4000 | 4000-8000 |
Scan speed | 1-10 ms | 100 µs | 10-100 ms | 20-200 ms |
Dynamic range | 1:10,000 | 1:5000 | 1:1000 | 1:5000 |
Sensitivity | ++ | +++ | +++ | ++ |
Quantification | +++ | ++ | + | ++ |
Tandem Mass Spectrometry
Explain the workflow. How can we determine the peptide sequence
Spectrum matching in Proteomics. How can we determine the false discovery rate (FDR)?
Two consecutive MS scans: MS1 shows the m/z of the tryptic peptides. The mass spectrometer selects the (20) most abundant ions from MS1 for MS2 -> peptide isolation and fragmentation -> measurement of the fragment m/z values
Sequence determination step by step through the specific amino acid masses (mass differences between consecutive fragment ions), or by database searching with the whole peptide or a sequence tag
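Reading a sequence from amino acid masses can be sketched as follows: the mass difference between two consecutive fragment ions of the same series corresponds to one residue. The residue masses are standard monoisotopic values (only a subset is listed), and the function name, tolerance and example ladder are hypothetical.

```python
# Monoisotopic residue masses in Da (subset; Leu and Ile are isobaric).
RESIDUE_MASSES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L/I": 113.08406, "D": 115.02694,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333,
}


def residues_from_ladder(fragment_mz, tolerance=0.01):
    """Translate differences between consecutive singly charged fragment ions
    into residues; '?' marks a difference that matches no listed residue."""
    residues = []
    for lower, upper in zip(fragment_mz, fragment_mz[1:]):
        delta = upper - lower
        match = "?"
        for name, mass in RESIDUE_MASSES.items():
            if abs(delta - mass) <= tolerance:
                match = name
                break
        residues.append(match)
    return residues


# Hypothetical b-ion ladder b1..b4 of a peptide starting with glycine.
print(residues_from_ladder([58.0287, 129.0659, 216.0979, 329.1819]))
# ['A', 'S', 'L/I']
```

Gaps in the fragment series or insufficient mass accuracy (e.g. K vs Q differ by only 0.036 Da) limit this approach, which is one reason database searching is the usual route.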
Dynamic exclusion prevents the same peptides from being selected and analysed again and again
Through the target-decoy approach: there are two databases, one composed of targets and the other of decoys. The tandem mass spectra are searched against both databases. PSMs to decoy sequences must be wrong matches (false positives), and about the same number of random matches is expected among the target hits. The FDR is therefore estimated as FDR = FP / (FP + TP) ≈ decoys / targets. Then define a cut-off score x (= how many FP you will accept): FDR(x) = (decoys with score > x) / (targets with score > x)
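The cut-off formula FDR(x) = decoys with score > x / targets with score > x can be written out directly. The sketch below is illustrative: PSMs are represented as hypothetical (score, is_decoy) pairs, and the search for a cut-off simply tries each observed score from low to high.

```python
def fdr_at_cutoff(psms, cutoff):
    """Estimate the FDR among PSMs scoring above `cutoff`.

    psms: list of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in psms if score > cutoff and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score > cutoff and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_cutoff(psms, max_fdr=0.01):
    """Return the lowest observed score that, used as cut-off, meets max_fdr."""
    for cutoff in sorted({score for score, _ in psms}):
        if fdr_at_cutoff(psms, cutoff) <= max_fdr:
            return cutoff
    return None


# Hypothetical search result: three target hits and two decoy hits.
psms = [(90, False), (80, False), (70, True), (60, False), (50, True)]
print(fdr_at_cutoff(psms, 65))  # 0.5 (1 decoy / 2 targets above 65)
print(lowest_cutoff(psms))      # 70
```

Real search engines work with many thousands of PSMs and often report q-values instead of a single cut-off, but the counting principle is the same.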
How does label-free quantification (LFQ) work? Two (relevant) advantages and two disadvantages of LFQ in comparison to label-based methods.
No isotopic label required, no artificial chemical modification of peptides or proteins. The same peptide is detected in consecutive MS1 spectra while it elutes, so the chromatographic peak can be reconstructed from the MS1 signal -> extracted ion chromatogram (XIC). The LFQ intensity is the integrated MS signal (area under the XIC) across the chromatographic peak and is a measure of peptide quantity. Relative quantification of the same peptide vs a reference sample or between x conditions.
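Integrating an XIC can be sketched with the trapezoidal rule. This is a minimal illustration with hypothetical retention times and intensities; real software additionally detects peak boundaries and aligns runs.

```python
def xic_area(retention_times, intensities):
    """Area under an extracted ion chromatogram (trapezoidal rule)."""
    area = 0.0
    for i in range(1, len(retention_times)):
        width = retention_times[i] - retention_times[i - 1]
        area += width * (intensities[i] + intensities[i - 1]) / 2
    return area


# Hypothetical MS1 intensities of one peptide in consecutive scans of two runs.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
sample = [0.0, 10.0, 20.0, 10.0, 0.0]
reference = [0.0, 5.0, 10.0, 5.0, 0.0]

print(xic_area(times, sample))                               # 40.0
print(xic_area(times, sample) / xic_area(times, reference))  # 2.0
```

The ratio of the two areas is the relative quantity of the peptide between the runs, which is why reproducible chromatography and MS performance matter so much for LFQ.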
- Pros:
- Cost and time effective with respect to sample preparation (no need to label/ pre-process the samples)
- Applicable to all identified/ quantified proteins – proteome-wide
- Applicable to all organisms/ tissues/ cells – independent of biology
- Applicable to very large numbers of sample – large scale
- No increase in sample complexity – high sensitivity
- Cons:
- Cost and time ineffective in terms of MS measurement time (individual sample prep and measurement, no multiplexing)
- Error- prone: Reproducible sample preparation, chromatography and mass spectrometric performance is critical for this approach! Accumulation of experimental variation along the workflow
- Quantitative precision and accuracy of label free analysis are often lower than those based on stable isotope labeling, particularly when workflow comprises many steps
- Missing values across different samples increase with increasing number of samples
- Large-scale label-free projects are computationally intensive and may need large bioinformatics resources
DDA, Targeted proteomics, DIA
Three differences between DDA and Targeted proteomic
Can we use TMT labeling when we do DIA? Your opinion with reasons
DDA (shotgun) data is analysed without prior knowledge of which proteins are present.
Targeted quantification: analysis of a preselected group of proteins (hypothesis driven, need prior knowledge)
DDA: Quantification of many proteins (1000s) in relatively few samples. Targeted: only 10s to 100s of proteins can be quantified.
Targeted: More precise, accurate and reproducible quantification (higher sensitivity and dynamic range)
DIA = data-independent acquisition: attempt to take the best of both shotgun and targeted: measure everything (without prior knowledge), then analyse selected analytes (with prior knowledge)
to quantify large numbers of proteins and peptides over many samples
when no need of maximum selectivity and sensitivity
when POI change in the future
TMT: Tandem mass tags. TMT labeling is a widely used method for quantitative proteomics that allows for the simultaneous measurement of multiple samples in a single mass spectrometry experiment. It involves the chemical labeling of peptide samples with isobaric tags, each of which contains a unique mass reporter ion that can be used to quantify the relative abundance of the labeled peptides across multiple samples.
Combining TMT labeling with DIA provides a powerful approach for high-throughput quantitative proteomics. TMT labeling allows for the multiplexed analysis of multiple samples in a single DIA experiment, enabling the quantification of thousands of proteins across multiple conditions or timepoints.
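Furthermore, pooled TMT samples are often pre-fractionated prior to analysis, which reduces the number of peptides in each DIA window and simplifies the identification and quantification of the peptides. A caveat: DIA uses wide isolation windows, so many precursors are co-fragmented; because the TMT reporter ions are the same for all peptides, their intensities cannot be assigned to individual peptides, which aggravates ratio compression.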
Post-translational modifications (PTMs)
Name 5 PTM
Why could it be difficult to investigate ubiquitination with TMT labeling?
Why do PTMs have to be enriched?
Phosphorylation
acetylation
hydroxylation
methylation
Ubiquitylation
Deamidation
Proteolytic cleavage
Glycosylation
TMT reagents react with primary amines, i.e. peptide N-termini and the ε-amino groups of lysine residues. Ubiquitination is a post-translational modification in which ubiquitin is attached to specific lysine residues of target proteins; after tryptic digestion a di-glycine remnant (K-ε-GG) stays on the modified lysine. The remnant carries a free amine that is also labeled by TMT, which can interfere with antibody-based enrichment of K-ε-GG peptides and can lead to inaccurate quantification or even loss of identification of the ubiquitinated peptide
- Enrichment of kinases: affinity matrices (kinobeads): unselective kinase inhibitors immobilized on sepharose beads. They bind kinases and pull them down from the sample
- Since PTMs usually occur at low levels, it is often necessary to enrich modified proteins or peptides prior to analysis to ensure adequate detection. PTM enrichment allows specific PTMs to be isolated from a sample
Investigation of interaction partners of proteins by using affinity purification. You are interested in a protein and want to find its interaction partners. Explain how you would set up such an experiment, name the steps. Critical steps in your experimental workflow/setup?
Choose appropriate affinity tag -> incubation of affinity tag with proteins -> express and purify tagged protein -> POI added to biological sample -> bind POI -> wash to remove undesired background proteins -> elute POI -> analyse by MS
Co-IP: (antibody - targeted)
Antibody against POI (bait)
Pulldown of AB-POI complex by protein immobilized to beads
Enrichment of bait protein and co-enrichment of interacting proteins
TAP
furnish gene of interest with a dual affinity tag at C or N terminus.
Steps: ProteinA-IgG interaction (1st affinity purification), wash, TEV cleavage, CBP-calmodulin interaction (2nd affinity purification), wash, elution by EGTA treatment
Critical: washing steps: must remove background proteins without washing away the bait
Critical: TEV cleavage: the protease should not cleave the POI
Definition of proteoform; how does complexity arise in the proteome compared to the genome? (approx. 6 min)
Proteoform = all different molecular forms in which the protein product of a single gene can be found
Complexity due to genetic variations, alternatively spliced RNA transcripts and PTMs
Reasons for peak broadening: how does it arise and how can it be circumvented? (approx. 6 to 10 min)
Eddy dispersion, longitudinal diffusion, restricted mass transfer
Eddy: analytes can take different paths through the stationary phase, different paths have different distances
solution: smaller particle size, better packing
Longitudinal: Brownian motion causes analytes to travel in all dimensions. Analyte concentration is lower at the edges of the peak than at the center.
solution: higher flow rate
Mass transfer: analytes need time to move between stationary and mobile phase. High flow rates restrict this mass transfer.
solution: lower flow rate
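The three contributions are commonly combined in the van Deemter equation, H(u) = A + B/u + C*u, where H is the plate height (lower means narrower peaks), u the flow velocity, A the eddy dispersion term, B the longitudinal diffusion term and C the mass transfer term. The equation is not named in the flashcards and is used here as the standard model; the coefficients below are hypothetical and in arbitrary units.

```python
import math


def plate_height(u: float, a: float, b: float, c: float) -> float:
    """van Deemter equation: H(u) = A + B/u + C*u."""
    return a + b / u + c * u


def optimal_velocity(b: float, c: float) -> float:
    """Flow velocity that minimises H: dH/du = -B/u^2 + C = 0 -> u = sqrt(B/C)."""
    return math.sqrt(b / c)


A, B, C = 1.0, 4.0, 1.0  # hypothetical coefficients, arbitrary units
u_opt = optimal_velocity(B, C)
print(u_opt)                         # 2.0
print(plate_height(u_opt, A, B, C))  # 5.0
print(plate_height(0.5, A, B, C))    # 9.5: too slow, diffusion dominates
print(plate_height(8.0, A, B, C))    # 9.5: too fast, mass transfer dominates
```

This shows why the two flow-rate "solutions" above pull in opposite directions: there is an optimum flow rate in between, while smaller and better packed particles lower the constant A term.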
- Bottom up approach: steps in 1-2 sentences with used reagents if possible
- Name one additional step for separation on protein level
1. Protein extraction (lysis buffer: HEPES/Tris; β-ME/DTT to reduce disulfide bridges; detergents, e.g. SDS; chaotropic reagents, e.g. urea) => disruption: chemical (detergents, e.g. SDS; chaotropic reagents, e.g. urea) or mechanical (sonication) => removing contaminants: disruption, digestion, precipitation, extraction
2. Digestion in solution / in gel (with proteases, e.g. trypsin: cleaves C-terminally of lysines and arginines), preceded by reduction (DTT/TCEP/β-ME) & alkylation -> peptide separation (reversed-phase chromatography, C18) -> MS analysis -> database search
Optional:
separation on cell/ organelle level
separation on protein level
protein fractionation: measure different fractions of samples separately, eg chromatography SEC; PAGE
protein enrichment: affinity enrichment eg cofactors/ Kinobeads; Co-immunoprecipitation
Separation on peptide level
peptide fractionation: chromatography
peptide enrichment: antibody based
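The trypsin rule from step 2 (cleavage C-terminal of lysine and arginine) can be sketched as a simple in-silico digestion. This is a simplified illustration: missed cleavages and the commonly applied exception for a following proline are ignored, and the example sequence is hypothetical.

```python
def tryptic_digest(sequence: str):
    """Split a protein sequence after every K or R (simplified trypsin rule)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


print(tryptic_digest("MKTAYRGLK"))  # ['MK', 'TAYR', 'GLK']
print(tryptic_digest("AAKGG"))      # ['AAK', 'GG']
```

Database search engines perform the same kind of digestion on every protein in the sequence database to generate the candidate peptides that spectra are matched against.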
in solution digestion
typically performed with samples containing chaotropes (e.g. urea buffers)
1. Reduction & alkylation
2. Proteolytic digestion (must denature sample but keep protease active)
3. Desalting (by solid phase extraction C18 to remove reagents)
+ simple to perform
+ relatively quick
+ independent of sample quantity
- not compatible with weird buffer ingredients
Name the method used for sequencing peptides with MS in proteomics (1 min)
Explain the FDR (6-8 min)
You searched data from a human microbiome sample against a human database and a second time against a larger database that also contains bacterial proteins (E. coli). Why do you get fewer hits with the larger database? (approx. 4 min)
tandem mass spectrometry
FDR = false discovery rate: tells you what fraction of the accepted hits are false positives, estimated by the target-decoy approach (FDR = FP / (FP + TP) ≈ decoys / targets)
- target database: sequence database containing all AA sequences known for e.g. humans
- decoy database: sequence database containing reversed or scrambled AA sequences for the same proteins
- search tandem mass spectra against both databases -> hits to decoys must be wrong matches -> expect to get the same number of random matches in the target database -> FDR = decoys/targets
- find the score cutoff which results in an acceptable FDR, typically 1%
Fewer hits because the search space is larger: the decoy database is also bigger -> more decoy hits -> higher FDR at a given score -> a higher cutoff is needed for the same FDR -> fewer proteins