2. Mass spectrometry
• A mass spectrometer measures molecular
masses.
• The mass unit is called the dalton (Da), which is
1/12 of the mass of a carbon-12 atom, and is about
the mass of one hydrogen atom.
• If there is a mixture of different molecules in a
sample, all the masses are measured
simultaneously. So you get a spectrum.
4. Each peak corresponds to a different
type of molecule in your sample
[Figure: mass spectrum. x-axis: m/z, 1000 to 3000; y-axis: relative intensity, 0 to 100%. Labelled peaks run from 1179.41 to 3106.42.]
Peak list (m/z, intensity):
…
2789.22 3597.0
2790.22 5018.0
2791.23 4406.0
2792.23 2868.0
2793.23 1234.0
…
5. Three Components of an MS
• A typical mass spectrometer contains
– Ion source (ionizer)
– Mass analyzer
– Detector
• The ion source charges the to-be-measured molecules.
– The charge can be negative, but is often positive.
– Two common types: MALDI and ESI.
– John B. Fenn and Koichi Tanaka: 2002 Nobel Prize in Chemistry
for electrospray and MALDI.
• The mass analyzer separates ions according to their mass-to-
charge ratio (m/z).
– Ion trap, TOF, quadrupole, FT-ICR.
• Detector detects the ions.
6. Ionization (1): MALDI
Matrix Assisted Laser Desorption/Ionization
Sample is co-crystallized with matrix (solid)
Formation of singly charged ions
Koichi Tanaka, Nobel Prize 2002
Other ionization methods exist.
7. Mass Analyzer (1) – TOF
• Time of flight.
[Figure: positively charged ions accelerated by an electric field (+ to -) and drifting to the detector.]
Time of flight is proportional to sqrt(m/z).
Other mass analyzers exist.
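The sqrt(m/z) relationship follows from energy conservation: an ion of charge z accelerated through a voltage U gains kinetic energy z·e·U = m·v²/2, then crosses a field-free drift length D in time t = D/v. The sketch below is only an illustration. The function name, the 1 m drift length and the 20 kV accelerating voltage are assumed values, not taken from the slides.

```python
import math

DALTON_KG = 1.66053906660e-27          # mass of 1 Da in kilograms
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one proton in coulombs


def time_of_flight(mass_da, charge, drift_length_m=1.0, accel_voltage_v=20_000.0):
    """Drift time in seconds for an idealized linear TOF analyzer.

    drift_length_m and accel_voltage_v are illustrative defaults (assumptions).
    """
    if mass_da <= 0 or charge < 1:
        raise ValueError("mass must be positive and charge at least 1")
    # Energy gained in the acceleration region: z * e * U.
    kinetic_energy_j = charge * ELEMENTARY_CHARGE_C * accel_voltage_v
    # Solve m * v^2 / 2 = z * e * U for the speed v.
    speed_m_per_s = math.sqrt(2.0 * kinetic_energy_j / (mass_da * DALTON_KG))
    return drift_length_m / speed_m_per_s


if __name__ == "__main__":
    for mass in (1000.0, 4000.0):
        print(mass, time_of_flight(mass, 1))
```

Quadrupling the mass at the same charge doubles the flight time, and doubling both mass and charge leaves it unchanged, which is why the analyzer separates ions by m/z rather than by mass alone.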
8. Putting Them Together
[Figure: MALDI source coupled to a time-of-flight analyzer (MALDI-TOF), with drift region (D).]
Average time in TOF: 10^-7 s; average speed: 1-2 x 10^5 km/h
11. MALDI-TOF Linear
Mass range = 800-200,000 Da
Sensitivity and accuracy decrease rapidly with size!
12. MALDI-TOF Linear vs Reflectron Mode
• Linear = poor resolution due to velocity variation of ions with the same m/z
• Reflectron = contact lens for a near-sighted machine!
Reflectron gives much better resolution for mass < 6,000
13. Protein “identification” with intact mass
• We measure the intact mass of the protein.
• Then search in the protein database to find a
protein with the same mass.
• Good idea but there are too many proteins
with the same mass.
• In the rest of the lecture we study more
sophisticated methods and why protein ID is
important.
17. Back to Basics…
Chemical Composition of Living Matter
27 of 92 natural elements are essential.
Elements in biomolecules (organic matter):
H, C, N, O, P, S
These elements represent approximately 92% of
dry weight.
Organic Matter
Organized in "building blocks":
amino acids → polypeptides (proteins)
monosaccharides → starch, glycogen
nucleotides → nucleic acids (DNA, RNA)
18. Mass (Weights) of Atoms and Molecules
element  nominal mass  exact mass  percent abundance  average mass
C        12            12.00000    98.9%              12.0107
         13            13.00335    1.1%
H        1             1.00783     99.98%             1.00794
         2             2.01410     0.02%
O        16            15.99491    99.8%              15.9994
         18            17.99916    0.2%
N        14            14.00307    99.63%             14.0067
         15            15.00011    0.37%
S        32            31.97207    94.93%             (exercise)
         33            32.97146    0.76%
         34            33.96787    4.29%
19. Mass or Molecular Weight of molecules
Ethyl acetate C4H8O2
4 C12   4 x 12.00000 = 48.0000
8 H1    8 x 1.00783  = 8.0626
2 O16   2 x 15.99491 = 31.9898
Nominal Mass: 48 + 8 + 32 = 88
Monoisotopic Mass: 48.0000 + 8.0626 + 31.9898 = 88.0524
Average Mass: 48.0428 + 8.0635 + 31.9988 = 88.1051
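The three kinds of mass differ only in which per-atom value is summed. Here is a minimal sketch, assuming standard isotope masses and average atomic masses for C, H, N and O; the table and function names are illustrative, and the formula is given as a plain element-to-count mapping rather than parsed from text.

```python
# Per-atom masses in daltons (standard reference values, rounded).
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16}
MONOISOTOPIC = {"C": 12.00000, "H": 1.00783, "N": 14.00307, "O": 15.99491}
AVERAGE = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994}


def formula_mass(composition, mass_table):
    """Sum count * per-atom mass over every element in the composition."""
    return sum(count * mass_table[element] for element, count in composition.items())


# Ethyl acetate, C4H8O2.
ETHYL_ACETATE = {"C": 4, "H": 8, "O": 2}

if __name__ == "__main__":
    print("nominal     ", formula_mass(ETHYL_ACETATE, NOMINAL))
    print("monoisotopic", round(formula_mass(ETHYL_ACETATE, MONOISOTOPIC), 4))
    print("average     ", round(formula_mass(ETHYL_ACETATE, AVERAGE), 4))
```

The monoisotopic mass uses the most abundant isotope of each element, while the average mass weights every isotope by its natural abundance.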
20. Amino Acids
• There are 20 standard amino acids. All have
the same basic structure but different side
chains.
• Examples:
– Glycine, or Gly, or G (side chain group: H)
– Arginine, or Arg, or R
21. All the 20 Structures
* Picture copied from Dr. R.J. Huskey’s website:
http://www.people.virginia.edu/~rjh9u/aminacid.html
22. Peptides and Proteins
[Figure: glycine (Gly, G) and arginine (Arg, R) joined by a peptide bond into the dipeptide GR, written from the N-terminal to the C-terminal.]
23. Mass of Amino Acid Residues
Exact Mass of Amino Acid Residues in Proteins
Gly G 57.02150
Ala A 71.03720
Gln Q 128.05860
Lys K 128.09500
Glu E 129.04270
Note: Leu (L) = Ile (I) = 113.08410
24. Amino Acid Table
Source: IONSOURCE.COM
AA   Code  Mono.        AA   Code  Mono.
Gly  G     57.021464    Asp  D     115.02694
Ala  A     71.037114    Gln  Q     128.05858
Ser  S     87.032029    Lys  K     128.09496
Pro  P     97.052764    Glu  E     129.04259
Val  V     99.068414    Met  M     131.04048
Thr  T     101.04768    His  H     137.05891
Cys  C     103.00919    Phe  F     147.06841
Leu  L     113.08406    Arg  R     156.10111
Ile  I     113.08406    CMC        161.01467
Asn  N     114.04293    Tyr  Y     163.06333
25. Cysteine
Proteins are often treated so that cysteine becomes
carboxyamidomethyl cysteine (CamC) or carboxymethyl
cysteine (CmC) in order to break the disulphide bonds.
CamC = 160.03 (residue mass)
26. Mass of Peptides and Proteins
Ala-Ser-Phe (ASF)
tripeptide MW = 71.04 + 87.03 + 147.07 + 18.01 (water) = 323.15
More precisely: monoisotopic mass 323.1481
average mass 323.3490
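The peptide mass is the sum of the residue masses plus one water (H on the N-terminal, OH on the C-terminal). A minimal sketch, assuming the monoisotopic residue values from the amino acid table and a water mass of 18.010565 Da; the names are illustrative, and only a few residues are included to keep it short.

```python
# Monoisotopic residue masses in daltons (subset of the amino acid table).
RESIDUE_MASS = {
    "G": 57.021464,
    "A": 71.037114,
    "S": 87.032029,
    "F": 147.06841,
    "R": 156.10111,
}
WATER_MASS = 18.010565  # monoisotopic H2O: 2 x 1.007825 + 15.994915


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide, given one-letter codes."""
    try:
        residues = sum(RESIDUE_MASS[aa] for aa in sequence)
    except KeyError as missing:
        raise ValueError(f"no residue mass for {missing}") from None
    # One water is added for the free N-terminal H and C-terminal OH.
    return residues + WATER_MASS


if __name__ == "__main__":
    print(round(peptide_mass("ASF"), 4))
```

Because Leu and Ile have the same residue mass, a lookup like this cannot tell them apart; a full table would simply list both with the same value.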
27. In a mass spectrum
Deconvolution adds all the isotopic
peaks to the monoisotopic peak.
So the later processing does not need
to worry about the isotopes.
[Figure: monoisotopic peak at 323.15, followed by isotope peaks at 324.15 and 325.15.]
31. Ionization (2) – ESI
Electrospray Ionization: Formation
of Charged Droplets
Formation of multiply charged ions
32. Multiply Charged Ions
• The same molecules may be charged
differently, and therefore form a few peaks in
the spectrum.
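[Figure: spectrum of one molecule at several charge states, with peak groups labelled (M+3)/3, (M+2)/2 and (M+1)/1 along the m/z axis; isotope peaks shown at 162.08, 162.58, 163.08 and at 323.15, 324.15, 325.15.]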
For a protein/peptide with positive charges, the charge is obtained by adding
protons (each of which has a mass of approx. 1 dalton). As a result, a molecule with
mass M and charge Z will have a peak at (M+Z)/Z.
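The (M+Z)/Z rule can be written out with the proton mass made explicit. A small sketch, assuming a proton mass of 1.007276 Da; the function names are illustrative.

```python
PROTON_MASS = 1.007276  # Da; the slide approximates this as 1


def mz_from_mass(neutral_mass, charge, proton_mass=PROTON_MASS):
    """m/z of a molecule of mass M carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * proton_mass) / charge


def mass_from_mz(mz, charge, proton_mass=PROTON_MASS):
    """Inverse of mz_from_mass: recover the neutral mass M."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mz * charge - charge * proton_mass


if __name__ == "__main__":
    # Peak positions of the same molecule at charge states 1 to 3.
    for z in (1, 2, 3):
        print(z, round(mz_from_mass(323.1481, z), 4))
```

Passing `proton_mass=1.0` reproduces the approximate formula exactly as stated on the slide.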
33. How to determine charge states?
• Use the isotope peaks when the resolution is high enough:
neighbouring isotope peaks are about 1/z apart in m/z.
• Compare peaks from different charge states when the
resolution is not high enough.
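When the isotope peaks are resolved, the charge can be read off their spacing: isotopes differ by about 1 Da in mass, so the peaks sit about 1/z apart in m/z. The sketch below assumes the input is one clean isotope cluster with no noise peaks; the function name and the `max_charge` cap are illustrative assumptions.

```python
def charge_from_isotope_spacing(cluster_mz, max_charge=10):
    """Estimate the charge state z from the m/z values of one isotope cluster."""
    peaks = sorted(cluster_mz)
    if len(peaks) < 2:
        raise ValueError("need at least two isotope peaks")
    # Average gap between neighbouring isotope peaks.
    gaps = [right - left for left, right in zip(peaks, peaks[1:])]
    mean_gap = sum(gaps) / len(gaps)
    if mean_gap <= 0:
        raise ValueError("isotope peaks must have distinct m/z values")
    # Spacing is about 1/z, so z is about 1/spacing; clamp to a sensible range.
    charge = round(1.0 / mean_gap)
    return min(max(charge, 1), max_charge)


if __name__ == "__main__":
    print(charge_from_isotope_spacing([162.08, 162.58, 163.08]))
    print(charge_from_isotope_spacing([323.15, 324.15, 325.15]))
```

With real data the cluster would first have to be picked out of the spectrum, which is the harder part and is not shown here.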
40. Convex Hull
• A (lower) convex hull is a chain of line segments
between data points such that all the data points
lie on or above the lines and their extensions.
41. How to calculate convex hull?
• Stack S contains all the data points that
form the convex hull so far.
• Data point D[i] = (D[i].x, D[i].y).
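Algorithm (data points sorted by increasing x):
1. S.push(D[0]); S.push(D[1])
2. for i from 2 to n-1
2.1 while S has at least 2 points and
S.secondtop(), S.top(), D[i] are concave
(S.top() lies on or above the line from S.secondtop() to D[i])
2.1.1 S.pop()
2.2 S.push(D[i])
3. return S
[Figure: the three points S.secondtop(), S.top() and D[i] tested in step 2.1.]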
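The stack procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the lecture's reference implementation: points are (x, y) tuples already sorted by increasing x, the "concave" test is done with a cross product, and the function names are made up for the example.

```python
def cross(a, b, c):
    """Cross product of (b - a) and (c - a); positive means a left turn at b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def lower_convex_hull(points):
    """Lower convex hull of points that are already sorted by increasing x."""
    stack = []
    for point in points:
        # Pop while the top of the stack lies on or above the line from the
        # second-from-top point to the new point (the "concave" case).
        while len(stack) >= 2 and cross(stack[-2], stack[-1], point) <= 0:
            stack.pop()
        stack.append(point)
    return stack


if __name__ == "__main__":
    spectrum = [(0, 0), (1, 2), (2, 1), (3, 3), (4, 0)]
    print(lower_convex_hull(spectrum))
```

Starting from an empty stack and checking its size inside the loop handles the first two pushes of step 1 without special cases.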
42. Analyze the convex hull algorithm
• Correctness
– The algorithm finishes.
– The output is a convex hull.
– The proof will be included in an assignment.
• Time complexity
– O(n) time, given data points sorted by x.
– Proof: each point is pushed onto the stack exactly
once and popped at most once, so there are at most
2n stack operations in total.
43. Summary of spectrum preprocessing
• Baseline correction
• Centroiding
• Charge recognition and deconvolution
• Noise removal