LECTURE: COMPOSITION
inorganic composition
molecular species
H, O, C, N, Ca - living organism
O, Si, Al, Fe, Ca - earth crust
Da = 1.66 x 10^-24 g (1 Da = 1 g/mol divided by Avogadro's number)
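A quick way to check the dalton-to-gram scale is to divide by Avogadro's number. This is only a sketch; the function name and the 50 kDa example protein are made up for illustration.

```python
# Avogadro's number (exact by SI definition).
AVOGADRO = 6.02214076e23  # particles per mole

def daltons_to_grams(mass_da: float) -> float:
    """Mass in grams of ONE particle whose mass is given in daltons (1 Da = 1 g/mol)."""
    return mass_da / AVOGADRO

one_dalton_g = daltons_to_grams(1.0)     # on the order of 10^-24 g
protein_g = daltons_to_grams(50_000.0)   # hypothetical 50 kDa protein

print(f"1 Da           = {one_dalton_g:.3e} g")
print(f"50 kDa protein = {protein_g:.3e} g")
```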
virus
protein > DNA > water
bacterial cell
water >( protein > RNA > DNA; 30%)
human fibroblast
water > protein > RNA > DNA
30-50% of genome has no known function
15-20% are unique to each organism
complexity of genome by number of genes? not a good indicator of complexity
molecular weight of cell components
organelle(10^9), macromolecular complexes(10^6 - 10^9), individual macromolecules(10^4 - 10^9), monomeric subunits(10^2 - 10^3), inorganic subunits(< 10^2)
serum composition units and scale? (roughly, biggest component?)
Cations: Na+ (135 mM) > K+ (5 mM) > Ca++ (3 mM) > Mg++ (1.2 mM)
Anions: Cl- (105 mM) > HCO3- (30 mM) > phosphate (1.5 mM)
Proteins: total = 80 g/l, albumin (60 g/l), most abundant protein in serum
fuels: glucose (6mM)
hormones (200pM)
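The serum list mixes g/l (proteins) with mM and pM (ions, fuels, hormones), so it helps to convert between them. The sketch below uses the concentrations from these notes; the molecular weights (albumin about 66.5 kDa, glucose 180.16 g/mol) are values assumed here, not taken from the lecture.

```python
def molar_from_mass(grams_per_litre: float, mol_weight: float) -> float:
    """Convert g/l to mol/l given a molecular weight in g/mol."""
    return grams_per_litre / mol_weight

def mass_from_molar(mol_per_litre: float, mol_weight: float) -> float:
    """Convert mol/l to g/l given a molecular weight in g/mol."""
    return mol_per_litre * mol_weight

ALBUMIN_MW = 66_500.0  # g/mol, approximate (assumption)
GLUCOSE_MW = 180.16    # g/mol (assumption)

albumin_mM = molar_from_mass(60.0, ALBUMIN_MW) * 1e3   # 60 g/l from the notes
glucose_g_per_l = mass_from_molar(6e-3, GLUCOSE_MW)    # 6 mM from the notes

print(f"albumin: 60 g/l is about {albumin_mM:.2f} mM")
print(f"glucose: 6 mM is about {glucose_g_per_l:.2f} g/l")
```

On these assumptions albumin dominates by mass but is under 1 mM, while Na+ dominates by molar concentration.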
BMI = mass / height^2 = 65 kg / (1.73 m)^2 = 21.7 kg/m^2
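BMI is body mass divided by the square of height (kg/m^2), so the height has to be squared before dividing. A minimal check with the 65 kg and 1.73 m values from the notes; the function name is just for illustration.

```python
def bmi(mass_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: mass divided by height squared."""
    return mass_kg / height_m ** 2

value = bmi(65.0, 1.73)
print(f"BMI = {value:.1f} kg/m^2")
```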
pico 10^-12
what's the range of light microscope?
- chloroplast to fish eggs
electron microscope?
- small molecules to plant and animal cells
smallest bacteria 0.3 microns
smallest eukaryote 4 microns
largest virus
RNA most reliable for genome complexity
LECTURE : STRUCTURE OF AMINO ACIDS AND PROTEINS
amino acid = amino group, carboxyl group and R and hydrogen
L vs D isomers - chirality of amino acid
peptide - strand of amino acid linked by peptide bond(formed by condensation)
cofactor - iron, copper, manganese, etc that facilitate enzyme binding, cofactors can be inorganic or organic
Aliphatic, non-polar
Glycine G (Gly)
Alanine A (Ala)
Proline P (Pro)
Valine V (Val)
Leucine L (Leu)
Isoleucine I (Ile)
Methionine M (Met)
Aromatic groups
Phenylalanine F (Phe)
Tyrosine Y (Tyr)
Tryptophan W (Trp)
Polar, uncharged
-OH, -S
Serine S (Ser)
Threonine T (Thr)
Cysteine C (Cys)
- carboxamide
Asparagine N(Asn)
Glutamine Q (Gln)
Positively charged
Lysine K(Lys)
Arginine R (Arg)
Histidine H (His)
Negatively charged
Aspartate D (Asp)
Glutamate E (Glu)
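The grouping above can be kept as a small lookup table for self-testing. This sketch follows the classes exactly as written in these notes (so His counts as positively charged); the example peptide is hypothetical.

```python
from collections import Counter

# Side-chain classes as grouped in these notes (one-letter codes).
CLASSES = {
    "aliphatic, non-polar": "GAPVLIM",
    "aromatic": "FYW",
    "polar, uncharged": "STCNQ",
    "positively charged": "KRH",
    "negatively charged": "DE",
}

# Invert to a per-residue lookup: one-letter code -> class name.
CLASS_OF = {aa: name for name, letters in CLASSES.items() for aa in letters}

def class_composition(peptide: str) -> Counter:
    """Count how many residues of a peptide fall into each side-chain class."""
    return Counter(CLASS_OF[aa] for aa in peptide.upper())

example = "MKDEFWS"  # hypothetical peptide, written N-terminus to C-terminus
composition = class_composition(example)
for name, count in composition.items():
    print(f"{name}: {count}")
```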
Angles
peptide bonds can be formed chemically by Merrifield (solid-phase) synthesis
Almost all peptide bonds in protein are trans, except X-Pro linkage that is cis/trans
peptide bond planar, cannot rotate
C(α) – N => φ angle
C(α) – C => ψ angle
Free to rotate
Conformations of protein can be specified this way
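To make "specified this way" concrete: φ and ψ are dihedral (torsion) angles defined by four consecutive backbone atoms, φ by C(i-1), N, C(α), C and ψ by N, C(α), C, N(i+1). The sketch below computes a dihedral from four 3D points in plain Python; the coordinates are toy values chosen for checking, not real protein coordinates.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond, in the range -180..180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1u = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))  # unit vector along the central bond
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, _scale(b1u, _dot(b0, b1u)))
    w = _sub(b2, _scale(b1u, _dot(b2, b1u)))
    x = _dot(v, w)
    y = _dot(_cross(b1u, v), w)
    return math.degrees(math.atan2(y, x))

# Toy geometry: central bond along z, outer atoms placed to give known angles.
cis = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
trans = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
quarter = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(cis, abs(trans), quarter)
```

For a real residue, phi would be dihedral(C_prev, N, CA, C) and psi would be dihedral(N, CA, C, N_next); a list of (phi, psi) pairs then describes the backbone conformation.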
Amino acids can be post-translationally modified
- cross links (very common)
Cysteine residues form disulfide bonds that link two polypeptides together and stabilize the protein
- phosphorylation (common in signal transduction)
phosphate group added to Ser, Thr, Tyr, His
- Glycosylation (targeting, cell-surface display)
- Methylation (epigenetic code in nucleosome)
Lys have methyl groups attached to them
- Hydroxylation
Hydroxyl added to proline structure, defect w/o hydroxylation in connective tissue
Common in various types of collagen structure, defect -> loose joints
- lipidation (targeting, signal transduction)
dipeptide
oligopeptide
protein (~40 a.a.)
enzymes -> ~100 a.a.
primary structure a sequence of R groups
talk about peptide from N-terminus to C-terminus
Alanyl-Glycine (Ala-Gly)
Glycyl-Alanine (Gly-Ala)
N-terminus peptide is translated first and then ends in C-terminal peptide
alanine, glutamic acid, leucine, methionine are preferred in alpha helix; glycine, tyrosine and serine are almost never found in alpha helix
proline is common in beta turns
Sickle cell anemia
Molecular disease - Linus Pauling
Glu -> Val at position 6 in beta chain of Hb; hydrophilic to hydrophobic; the hydrophobic side chain tends to associate with hydrophobic patches on other Hb molecules, causing polymerization -> sickle cell shape
Protein purification
- Lyse cells
- Centrifugation
- Fractionation (salting out)
- Column chromatography
- difference in protein charge, size, binding characteristics (affinity, antibodies, tag (His residue at N terminus) bind to nickel, very powerful)
chromatography types - affinity based most commonly used
ion exchange, beads negative, proteins with negative charges elute faster
Gel electrophoresis to check purity
Electric field, protein migrate through gel as a function of molecular weight
SDS denatures protein; disulfide bonds are not broken by SDS itself but by an added reducing agent (e.g. DTT or beta-mercaptoethanol)
Sequencing protein
direct sequencing of peptides by chemical methods, difficult, rarely used; N-terminal sequence (Edman degradation), C-terminal (enzyme carboxypeptidase)
DNA sequencing of genes Most common way of obtaining massive amounts of sequence
Mass spectrometry of peptides (powerful) common way to identify particular protein after gel electrophoresis in complex mixtures
Reverse genetics
Protein -> all possible DNA sequences -> isolation of the gene; looking for possible genetic phenotypes from known sequence
Protein structure / folding
Favored by hydrophobic effect, hydrogen bonds, electrostatics
Modulated by side chain/main chain steric interactions
Opposed by loss of entropy
Non-covalent bonds
H-bond
Ionic
Hydrophobic
Van der Waals
LECTURE : STRUCTURE OF NUCLEIC ACIDS
Chemistry of bases
- bases are flat rings
- bases are stacking on each other; they are planar and pi-orbital reinforce the stack
o purine stack better than pyrimidines
- bases can form hydrogen bonds
o hydrogen bonds essential in maintaining tertiary structures
- Chargaff's rule (amount of A = T and amount of G = C in double-stranded DNA)
- Bases undergo tautomerism and resonance, e.g. the rare imino tautomer of cytosine can mispair with adenine
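Chargaff's rule follows directly from base pairing: in a duplex every A faces a T and every G faces a C. A small sketch that builds the complementary strand and counts bases over both strands; the sequence is an arbitrary example.

```python
from collections import Counter

# Watson-Crick partners for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return strand.translate(COMPLEMENT)[::-1]

def duplex_base_counts(strand: str) -> Counter:
    """Base counts over both strands of the duplex formed by `strand`."""
    return Counter(strand + reverse_complement(strand))

top = "ATGCGGATTACCA"  # arbitrary example sequence
counts = duplex_base_counts(top)
print(counts)
print("A == T:", counts["A"] == counts["T"], "| G == C:", counts["G"] == counts["C"])
```

The rule holds for the duplex as a whole; a single strand on its own (like `top`) does not need to have A = T or G = C.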
Nucleic acids are negatively charged (phosphate groups)
- Proteins that bind to them tend to be positively charged
2’ OH group (DNA has H in 2’) makes the RNA more unstable, RNA can be hydrolyzed very easily by base
5' carbon of the sugar is where the phosphate is attached, 3' carbon carries the hydroxyl group, and 1' carbon is where the base is attached
nucleoside - no phosphate group
NMP, NDP, NTP
RNA structure and function
- mRNA(messenger) is one function
- cellular RNA single stranded
- many RNA form complex structures
- many RNAs associated with proteins (RNP), e.g. telomerase, ribozyme
- RNAs catalytic activity (ribozyme)
- miRNA play regulatory roles
- non-coding RNAs, mRNA, tRNA, miRNA present in both prokaryote and eukaryote
Enzymatic heart of ribosome is ribozyme (PT catalytic site no green); ribosomes are mainly composed of RNA and little protein
LECTURE: pH
important properties of water
Kw = 1.0 * 10^-14 = [H+] [OH-], (mol/dm^3)^2
pH = -log[H+]
[H+]=10^-7; pH = 7
HA <=> H+ + A-
Ka = ([H+][A-])/[HA]
-logKa = pKa
[H+] = Ka*[HA]/[A-]
-log[H+] = -log(Ka) - log([HA]/[A-])
pH = pKa + log([A-]/[HA])
blood pH slightly alkaline
bigger Ka = lower pKa = stronger acid
when pH = pKa, max buffering capacity
Henderson Hasselbalch equation remember???
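A short numerical check of the pH definition and the Henderson-Hasselbalch equation, pH = pKa + log([A-]/[HA]). The pKa of 4.76 (acetic acid) is an example value assumed here, not from the lecture.

```python
import math

def ph_from_h(h_molar: float) -> float:
    """pH = -log10[H+], with [H+] in mol/l."""
    return -math.log10(h_molar)

def henderson_hasselbalch(pka: float, base_conc: float, acid_conc: float) -> float:
    """pH of a buffer from its pKa and the ratio [A-]/[HA]."""
    return pka + math.log10(base_conc / acid_conc)

neutral = ph_from_h(1e-7)                            # pure water
PKA_EXAMPLE = 4.76                                   # assumed example pKa (acetic acid)
at_pka = henderson_hasselbalch(PKA_EXAMPLE, 1.0, 1.0)      # [A-] = [HA]
ten_to_one = henderson_hasselbalch(PKA_EXAMPLE, 10.0, 1.0)  # [A-] = 10 x [HA]

print(f"pH of water: {neutral:.2f}")
print(f"[A-] = [HA]      -> pH = {at_pka:.2f} (equals pKa)")
print(f"[A-] = 10 x [HA] -> pH = {ten_to_one:.2f} (pKa + 1)")
```

When [A-] = [HA] the log term is zero, which is why pH = pKa marks the midpoint where buffering capacity is greatest.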
-------------------------------------------------------------------------------
DNA REPLICATION
General properties of DNA replication
- semi-conservative
- nucleotides are added at the 3' end; DNA polymerase synthesizes the new strand in the 5' to 3' direction
- requires an RNA primer; primase synthesizes short RNA primer
- semi-discontinuous
o strand that is synthesized continuously in the direction of fork movement is the leading strand; the other, the lagging strand, is synthesized in short pieces (Okazaki fragments)
o asymmetric process: one continuous, one discontinuous
Okazaki fragments are joined by DNA ligase
- initiation occurs at defined origin of replication
- replication forks are bidirectional
DNA polymerase
- cannot start de novo
- requires an RNA primer, made by primase
Ligase joins Okazaki fragments
- RNA primer removed, then the nick sealed by DNA ligase
DNA replication enzymology
- DNA polymerase for leading and lagging strand
- DNA helicase – separate strands
- Gap in lagging strand template between Okazaki fragments; single stranded DNA is stabilized by single-stranded binding protein (SSB in prokaryote, RPA-replication protein/factor A, in eukaryote)
- Sliding clamp: encircles duplex (newly synthesized) DNA, slides along DNA, helps tether DNA polymerase to the template
o Beta-clamp in prokaryote (dimer)
o PCNA in eukaryote (homotrimer), PCNA also serves as landing pad for many DNA repair and checkpoint proteins
- Clamp loader loads sliding clamp to DNA
- Topoisomerases break DNA strands, move them around and rejoin them
o Splits a DNA strand and pass the intact strand and then put the split DNA back together
o Also important in transcription(as well as replication)
o Type I breaks one strand (passes the other strand of the same duplex through the break), type II breaks both strands of duplex
Replisome
- two polymerases interact with each other in some way: DNA polymerases of leading and lagging strands both move together in the same direction; therefore DNA is looped.
- in lagging strand, finish Okazaki fragment, easier to jump to next site of Okazaki fragments
- strand separation by helicase creates supercoiling
Replication from each origin is bidirectional
- two forks of one double stranded DNA, lagging and leading reversed opposite side
eukaryotic chromosome has multiple origins
- early origin fires early, late origin fires later
End replication problem
- lagging strand’s RNA primer removed, then DNA polymerase can’t normally fill this gap; each replication results in lagging strand shorter => replicative senescence
- leading strand 3'
- telomeres = specialized structure/sequence at chromosome ends
o highly repetitive (GGGTTG)
- telomerase – RNA-dependent DNA polymerase that extends lagging strand template
o adds more repeats enough to allow another Okazaki fragment to be added
- somatic cells don’t use telomerase so they undergo senescence
- tumor cells have telomerase that immortalizes them
Replication fidelity
- DNA polymerase 5’ -> 3’: error rate 1/10^5
- 3’ -> 5’ exonucleolytic proofreading increases fidelity by order of 10^2
- Strand-directed mismatch repair increases fidelity by order of 10^2
- => 1 / 10^9 nucleotides polymerized error rate
- Price of high fidelity is a very slow enzyme
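The fidelity figures above multiply together. A small arithmetic sketch using the factors from these notes; the genome size of about 3 x 10^9 bp (human haploid) is an assumption added here to give the error rate a scale.

```python
polymerase_error_rate = 1e-5      # errors per nucleotide, 5' -> 3' polymerization alone
proofreading_gain = 1e2           # 3' -> 5' exonuclease proofreading
mismatch_repair_gain = 1e2        # strand-directed mismatch repair

overall_error_rate = polymerase_error_rate / (proofreading_gain * mismatch_repair_gain)

GENOME_BP = 3e9  # approximate human haploid genome size (assumption)
expected_errors = overall_error_rate * GENOME_BP

print(f"overall error rate: about 1 in {1 / overall_error_rate:.0e} nucleotides")
print(f"expected errors per genome copy: about {expected_errors:.0f}")
```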
Proofreading - during replication
- DNA polymerase won’t add to a 3’ end of a nucleotide that is not base-paired with template (RNA polymerase doesn’t have this function)
- A – C mis-pair not properly base-paired
- If 3’ end is not properly base-paired, that end will be sent to editing site of DNA polymerase, the false nucleotide clipped off
- The clipped end goes back to polymerization site to attempt replication again
- Mismatch repair (MutS and MutL proteins involved) - after replication, but part of proofreading function
o Recognize distortion in DNA, usually bound by some protein
o Complex finds a nick(differentiates new from old DNA strand) in new DNA strand, when nick found, the segment cut off and DNA synthesis in that segment is repeated
e.g. hereditary nonpolyposis colorectal cancer(HNPCC) results from defective mismatch repair
Positive and negative supercoil
why do we not have DNA primer in replication?
DNA repair and recombination
Damage = chemical alteration in DNA
Mutation = permanent change in DNA
Change in a single nucleotide A/T -> G/C
Insertion/deletion of small number of nucleotide => homopolymer runs, same nucleotide over and over e.g. AAAAAA
Chromosome rearrangements
Insertions/deletion, duplication, inversion, translocation(exchange of pieces between non-homologous chromosomes)
Changes in chromosome number (aneuploidies)
3 -> trisomy e.g. Down syndrome (trisomy 21)
Two classes of genes that maintain genome stability
Caretakers – repair genes, act directly on DNA
Gatekeeper – control cell cycle e.g. checkpoint proteins
Major sources of mutation
DNA replication errors – e.g by polymerase, correct by proof-reading and post replicative mismatch repair
DNA damage
- spontaneous: as a result of normal metabolism in cells e.g. oxidative damage
- induced: come from environment, chemical found in tobacco smoke, UV, ionizing radiation
If there’s damage in DNA,
- damage pre-replication repair
a. direct reversal, DNA is directly repaired
b. excision repair, damage is excised from DNA
- replication
c. translesion synthesis – specialized DNA polymerase, large pocket site for repair, downside is that it’s a very sloppy polymerase
d. homologous recombination
- post-replication
e. mismatch repair
DNA damage
Altering base pairing properties, or destroys ability to base-pair
Common miscoding DNA lesions
- Cytosine deaminated to uracil; CG -> T(U)A mutations
- Guanine oxidized by ROS (reactive oxygen species) to 8-oxoguanine, which pairs with A or C
Common lesions that block DNA polymerase
- abasic sites (bond between base and sugar breaks)
- UV-induced photoproducts (between pyrimidines)
a. Cyclobutane dimer formed by covalent linkage between adjacent Ts, prevents DNA from getting into active site of polymerase
b. 6-4 photoproduct, linkage between pyrimidines, also block DNA polymerases
DNA if not repaired, can be bypassed/tolerated
- homologous recombination
a. undamaged strand of DNA is used as a template to replicate and bypass the lesion; high fidelity
- translesion synthesis, specialized DNA polymerases that come and fill in directly over the lesion; low fidelity
b. normal T, 75 nucleotide, but if thymine dimer present, stops at 44
c. if polymerase eta used, it is able to go right through the damage => bypass is error free
Repair of damaged DNA
- direct reversal: e.g. thymine-thymine(pyrimidine) dimer caused by UV can be broken by photolyase + white light, also enzyme alkyl transferase removes alkyl group from base
- base-excision repair – single base excised
o Glycosylase – each highly specific, recognize specific type of lesion; e.g. UNG – removes uracil, OGG – removes 8-oxoG
§ Cleaves sugar-base linkage, sugar phosphate backbone remains intact
o AP endonuclease(and phosphodiesterase) removes sugar phosphate, change abasic site to single nucleotide gap
o DNA polymerase adds new nucleotides, DNA ligase seals nick
- nucleotide excision repair – oligonucleotide excised
o e.g. pyrimidine dimer removal
o recognizes any gross/bulky type of helix distorting lesion
o binds to the lesion, makes nicks on the strand that contains lesion on both sides, by nuclease
o DNA helicase removes the nicked strand
o DNA polymerase and ligase fill the gap
o Defective nucleotide excision repair -> e.g. xeroderma pigmentosum; XP can also result from absence of translesion synthesis DNA polymerase pol eta
- Homologous recombination/Repair of double strand breaks
- Double strand breaks very detrimental compared to single strand break
- usually accidental e.g. x-ray UV
- physiological, intentional – meiosis
o => meiosis (2n -> 1n): homologous chromosomes segregate, programmed double strand break for cross-over
- loss of nucleotide from degradation from break ends
o ends ligated back together -> non-homologous end-joining
o copying from a homologous duplex molecule -> homologous recombination
- homologous recombination may result in cross-over event
o important in mitotic cells as repair mechanism => may result in loss of heterozygosity during mitotic division by cross-over between homologous chromosomes => one cell homozygous for normal gene and one cell homozygous for mutant gene
- Mismatch repair (part of replication proofreading function)
o Recognize distortion in DNA, usually bound by some protein
o Complex finds a nick(differentiates new from old DNA strand) in new DNA strand, when nick found, the segment cut off and DNA synthesis in that segment is repeated
Cell cycle checkpoints – allows time for repair of DNA damage
G1 -> G1 checkpoint -> S(Intra S check point) -> G2 -> G2 checkpoint (is all DNA replicated?) -> M -> G1
If damage present, cell cycle arrests
Intra S check point – check for problem during DNA replication
Rad52 -> required for repair of double strand breaks
Irradiate Rad52 mutant, cell does not divide and dies
Irradiate Rad9 mutant -> cell does not arrest, result microcolony but all dies
RPA(same as SSB) can sense damage in DNA
Send signals to cause response to cells
- can recruit DNA repair pathway
- global transcriptional response
- DNA damage checkpoint, cell arrest to repair DNA
- Too much damage -> apoptosis triggered
remember the syndromes and functions?????????
------------------------------------------------------
LECTURE : GENE AS INFORMATION
e.g. of RNA replicating themselves Dengue, HepD
Ribosomal RNA, tRNA DNA -> RNA and stop
more universal view of information flow??
Linearity of gene
Aminoacyl tRNA synthetase – recognize shape of a.a. side chain, sense the shape of tRNA that has correct anti-codon, and charge the tRNA with a.a. => particular tRNA bound to particular amino acid
Introns – intervening sequences, not found in mature product of mRNA
Exon – anything in mature mRNA
Edges between exon and intron are well defined
Exons about 300 nt, introns about 3,400 nt
Average protein-coding human gene has 8-9 exons and 7-8 introns
Sometimes exons may be large
Can an intron of one gene be an exon of another?
there's no difference in genetic code between eukaryote and prokaryote, however there may be codon usage bias between types of organisms
Globin gene/ anatomy of small gene
Promoter
Open reading frame – define something that starts with AUG and codes for x number of amino acid
Untranslated regions in mRNA -> part of first and last exons respectively
Definition of 5’ end of exon is made by where the RNA polymerase makes first copy to RNA
5’ UTR - untranslated region -> open reading frame (product begins to be made)
Splice site(SS) – edges between exons and introns
5’ SS – donor, 3’ SS - acceptor
Exons made in capital letters, introns in small letters
3’ UTR – untranslated region of last exon before its 3’ cleavage site
3’ cleavage site that defines end of last exon
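The mRNA layout described above (5' UTR, then an open reading frame starting at AUG, then 3' UTR) can be sketched as a simple scan. This is a simplification that takes the first AUG and reads in-frame codons up to the first stop codon; the example sequence is made up.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def first_orf(mrna: str):
    """Return (five_utr, orf, three_utr) for the first AUG-initiated ORF, or None.

    The ORF includes the start codon and the stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Step through in-frame codons after the start until a stop codon appears.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    return None  # no in-frame stop codon found

example_mrna = "GGAUGGCCUAAGG"  # made-up sequence
five_utr, orf, three_utr = first_orf(example_mrna)
print("5' UTR:", five_utr)
print("ORF:   ", orf)
print("3' UTR:", three_utr)
```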
Pseudogenes – unknown function
Far distant elements(locus control region, LCR)
Repeated elements in gene cluster
Long interspersed elements (LINE)
Short interspersed elements (SINE)
Most of these belong to Alu family
Very abundant throughout genome
codons that are not degenerate -> AUG (Met) and UGG (Trp)
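Degeneracy can be checked by counting how many codons in the standard genetic code map to each amino acid. The sketch below builds the standard table in the usual T, C, A, G order (DNA letters; read T as U for mRNA) and lists the amino acids specified by a single codon.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid (stop codons excluded).
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
non_degenerate = sorted(aa for aa, n in degeneracy.items() if n == 1)

print("codons per amino acid:", dict(degeneracy))
print("specified by a single codon:", non_degenerate)
```

Besides Met (AUG), the count shows Trp (UGG) as the other amino acid with only one codon.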
TEAM SESSION
Watson-Crick pairing vs base excision repair vs mismatch repair vs proofreading - high fidelity
what part of DNA helix is outward? sugar-phosphate backbone
RNA used in bacterial cell wall assembly(UDP, glycan synthesis)
vitamin biosynthesis(nucleotides can be cofactors in the synthesis of vitamin)
cytosine deamination, nitric oxide vs methylation, methylation is more common
change the transcription factor (DNA binding protein), change the fate of the cell -> differentiation cascade
pluripotent - cells that are able to differentiate into many cell types
totipotent - cells that are able to differentiate into all cell types
cell type defined by switch from one pattern of regulated gene expression to another -> different transcription factors that lead to different
asymmetric cell division/distribution of content of cells -> differentiation
DNA major groove and minor groove
One groove, the major groove, is 22 Å wide and the other, the minor groove, is 12 Å wide.
The narrowness of the minor groove means that the edges of the bases are more accessible in the major groove. As a result, proteins like transcription factors that can bind to specific sequences in double-stranded DNA usually make contacts to the sides of the bases exposed in the major groove
double strand break repair
Deactivate a specific residue and see if it affects biochemical activity
- introduce wildtype and mutant type to compare function
- shows causation rather than correlation
cancer cells avoid repair pathways
ion exchange chromatography (negatively charged beads): with a pH gradient from low to high, negatively charged proteins elute first, then gradually the more positively charged ones. Positively charged proteins bound to the beads can also be eluted by increasing the concentration of sodium chloride or another salt in the buffer; sodium ions compete with the positively charged proteins for the beads. Proteins with low positive charge density elute first, followed by those with high charge density.