Resources

Work flow
This is a high-level overview of what we do in a cutting-edge Genomics lab. It is the story of how we go from tissue sample to sequenced genome.

The Genomics lab contains extensive instrumentation. This lab walkthrough was created to help you familiarize yourself with the capabilities of our lab.

So how do we go from living organism to a DNA sequence we can actually read? DNA is found inside the nucleus of certain cells. Genomics is the study of an entire strand of DNA, but it is not so simple to get to that point - removing the DNA from the nucleus will not yield perfect strands. Instead DNA is cut up, so we wind up with fragments. These must be sorted and read individually, then reassembled using computer software.
- DNA Extraction
  
  The first step is to collect tissue samples. Zoological specimens must be tissued properly, which means freezing the tissue in liquid nitrogen as soon as it is collected. By freezing tissue very fast instead of slowly, the ice forms smaller crystals, which do less damage to the cells.
  
  The tissues must be stored in a -80°C freezer, of which we have 11. Such a low temperature is required to prevent enzymes from breaking down the DNA, as enzymes only function within certain temperature ranges.
  
  snapper
  If you need to efficiently extract high molecular weight DNA, RNA or protein from fresh tissue samples, with high recovery and high yield, we have a Covaris CryoPrep™.
  
  This device is able to pulverize snap frozen samples contained in amazing aerospace-designed bags to recover high molecular weight DNA and other biological products. Samples traditionally difficult to extract by other methods, but successfully extracted with the CryoPrep™ include plant material with high cellulose content, including seeds. CryoPrep™ cryogenic tissue pulverization works on samples with high pigment content. Other application areas include dry pulverization, hard tissues, tumor, bone, cartilage, cardiac, and skin.
- Duplication and Storage
  PCR
  In the lab we can amplify (make many copies of) specific genes using PCR.
  
  PCR is the breaking apart of DNA double helixes (denaturation), the combining of primers to specific parts of a DNA strand (annealing), and the adding of nucleotides to the primer by polymerase (extension), allowing you to double the DNA at a specific location (in our case, a specific gene) in one thermal cycle. By repeating the thermal cycle, you can amplify specific genes or segments of DNA, creating copies exponentially.
  
  Cloning and Colony Picking
  We can use bacteria to store DNA and copy DNA. Bacteria routinely pass fragments of DNA to each other through conjugation, in a process known as horizontal gene transfer. This is what is used in cloning.
  
  Bacteria are great for sorting and copying our DNA. They contain plasmids. An engineered plasmid called pUC18 gives us two important traits: the lacZ gene, which expresses a blue color in the presence of X-gal (a sugar analog of lactose), and an ampicillin-resistance gene. The blunt-ended DNA fragments can be inserted into the lacZ gene, disrupting the blue color. To get plasmids into bacteria faster, we can use an electroporator. It uses an electric shock to create tiny holes in the bacteria for the plasmids to enter. Bacteria which take in the DNA are then referred to as "transformed." Only one in 1000 E. coli incorporate our DNA. By using an agar plate with X-gal and ampicillin, we can see which colonies carry inserted DNA: colonies with intact lacZ genes will be blue, indicating they obtained a pUC18 plasmid without our DNA fragment, and the ampicillin will kill any bacteria which did not take up a plasmid. We can then use a robotic colony picker to create new plates for the white colonies, as white colonies represent bacteria containing our fragments.
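
The blue/white screening logic described above can be written out as a small decision function. This is only an illustrative sketch: the function name and the outcome labels are made up for this example, and it assumes a plate containing both X-gal and ampicillin.

```python
def colony_outcome(has_plasmid: bool, has_insert: bool) -> str:
    """Predict what happens to a bacterium plated on agar with X-gal and ampicillin."""
    if not has_plasmid:
        # Without pUC18 there is no ampicillin-resistance gene.
        return "killed by ampicillin"
    if has_insert:
        # Our fragment sits inside lacZ, so no blue color develops.
        return "white colony"
    # Plasmid taken up, but lacZ is intact and still turns X-gal blue.
    return "blue colony"


for plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(plasmid, insert, "->", colony_outcome(plasmid, insert))
```

Only the "white colony" case is worth picking, which is why the colony picker is pointed at white colonies.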
  
  After letting the bacteria multiply for a day, we transfer the bacteria into a new 384-well plate and fill each well with a TE buffer. The TE buffer will protect the DNA while we heat up the plate to 95°C, allowing us to separate the plasmid from the bacteria by bursting the cell walls.
  
  The plasmids can then be amplified using rolling circle amplification. TempliPhi (an enzyme, free nucleotides and hexamers) is added to the wells.
  
  Storage (BAC/Genomic Libraries)
- Quantifying and Quality of DNA
  Tired of analyzing bands on gels and quantifying concentrations? Want to know exactly what is in your sample after shearing? DNA quantification is the answer!
  
  Our lab uses the AATI Fragment Analyzer, which offers a comprehensive platform for DNA qualification that goes beyond a basic concentration analysis. This system is often used for NGS quality control, gDNA quantification, and RNA research to cleanly and efficiently produce a comprehensive report within an hour of analysis.
  
  This system uses an acrylamide gel system that pulls DNA through the capillary array by using a high voltage to separate the fragments based on size. This system can analyze fragments that range from 2-60,000 bp, though there is a system being developed that can separate fragments as large as 120,000 bp.
- Shearing and Size Selection
  
  Shearing
  The Covaris LE220R™ and two Hydroshear™ are instruments which allow us to cut DNA into random fragments. These random fragments will overlap each other. When sequenced, the overlaps can be used for alignment, allowing us to reassemble the many small DNA segments into one large string.
  
  Covaris AFA works using the principle of cavitation, by creating microscopic bubbles. When the bubbles collapse, water rushes in and cuts the DNA. This is the same idea as the depth charges used to sink submarines; water rushing in from a collapsing bubble of air hits the hull and causes the damage.
  
  Hydroshears work by passing DNA samples through a tube, where it is forced through a small aperture in a ruby. This will stretch the DNA, cutting it into consistent sizes. The ends of the DNA will be sticky (have an overhang) and must be ligated using enzymes.
  
  Gel Electrophoresis
  Size Selection
  
  Faster DNA size selection; NGS Library Construction with the Sage Science Pippin Prep™.
  
  Following DNA shearing, we offer advanced fragment size selection using the Sage Science Pippin Prep™. Individual sample channels eliminate sample cross-contamination, and reproducible extractions provide more consistent results. The Pippin Prep™ collects targets between 50bp - 1.5kb, providing narrow fragment size distributions for paired-end sequencing. Minimal low molecular weight contamination reduces wasted reads and ambiguous indel calling. This approach maximizes short read DNA assemblies.
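
As a rough illustration of what size selection does to a sheared sample, the sketch below keeps only fragments inside a chosen window. The function name, the example fragment lengths and the window are hypothetical; the only figures taken from the text are the 50bp and 1.5kb limits.

```python
PIPPIN_MIN_BP = 50     # lower collection limit quoted above
PIPPIN_MAX_BP = 1500   # upper collection limit quoted above (1.5kb)


def select_fragments(lengths_bp, low_bp, high_bp):
    """Return the fragment lengths that fall inside the target window."""
    if not (PIPPIN_MIN_BP <= low_bp <= high_bp <= PIPPIN_MAX_BP):
        raise ValueError("target window must lie between 50 and 1500 bp")
    return [n for n in lengths_bp if low_bp <= n <= high_bp]


sheared = [40, 180, 300, 420, 2000]        # made-up fragment lengths in bp
print(select_fragments(sheared, 250, 450))  # [300, 420]
```

A narrow window like this is what gives the tight fragment size distribution wanted for paired-end sequencing.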
  
  More information: http://www.sagescience.com/products/pippin-prep/
- DNA Sequencing
  Sequencing is any of several methods to read the A, G, T or C arrangement on a strand of DNA. The DNA is sequenced in tiny pieces (50bp-10kb) which are then reassembled on a computer, providing a picture of the entire genome sequence.
  
  first generation
  We use an ABI 3730xl, a type of Sanger sequencer.
  
  Sanger Sequencing, or capillary sequencing, is a way of reading the nucleotides by using chain termination.
  
  dNTPs are the free nucleotides (deoxyribonucleotide triphosphates, or free A, C, G and T nucleotides) dATP, dCTP, dGTP and dTTP. In chain termination we use both dNTPs and ddNTPs (dideoxyribonucleotide triphosphates, or free A, C, G and T nucleotides in which the hydroxyl group on the third carbon is replaced by a hydrogen) together. The polymerase enzyme can add a ddNTP to the growing strand, but without the 3' hydroxyl group it cannot attach the next nucleotide, so the chain ends at that location.
  
  The dNTPs and ddNTPs are added into four separate containers with a radioactive primer. This allows us to see only one color to infer the nucleotide.
  
  The ddNTPs will be added in random places along each strand, and by stopping the polymerase action randomly we can run the various DNA fragments out on gels and visualize the chain-terminations to determine the DNA sequence.
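
The idea of reading a sequence from chain-terminated fragments can be sketched in a few lines. This is an idealized toy model, not a description of the ABI 3730xl software: it assumes every possible termination point is observed exactly once, it ignores the primer length, and the function names are invented for the example.

```python
def terminated_lengths(new_strand):
    """Map each ddNTP base to the fragment lengths at which it ends a chain."""
    lengths = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        # A chain stopped by this ddNTP is exactly `position` bases long.
        lengths[base].append(position)
    return lengths


def read_sequence(lengths):
    """Rebuild the sequence by ordering fragments from shortest to longest."""
    ordered = sorted((n, base) for base, ns in lengths.items() for n in ns)
    return "".join(base for _, base in ordered)


lanes = terminated_lengths("GATTACA")
print(lanes)                 # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_sequence(lanes))  # GATTACA
```

Sorting by fragment length plays the role of the gel or capillary: the shortest fragment reveals the first base, the next shortest the second, and so on.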
  
  Chain termination sequencing is useful as it can read long sequences of DNA (over 800bp) and is very accurate. However, this method of sequencing requires a lot of work and a large amount of reagent, making it expensive per base sequenced: nearly 50,000 times more expensive per base than Illumina sequencing.
  
  For more information on chain termination sequencing, there are YouTube videos which illustrate it in more depth, as well as good information at JGI.
  
  http://www.youtube.com/watch?v=eJ3WvQsPLUk http://www.youtube.com/watch?v=VhH-SJKGAPo
  Also see the 3-part series of shotgun sequencing: part I, part II and part III.
  
  next-gen
  Illumina (150bp) performs sequencing by synthesizing complementary DNA strands. It is another method of chain termination. DNA which has been sheared into small fragments is separated on gel by fragment size. A certain size fragment is selected and amplified through PCR. Bridge amplification is employed to synthesize copies of the DNA on the surface of a flow cell. As each base is added a camera records the fluorescent signal emitted to determine the sequence. This is done in parallel, with 150 million DNA fragments per flow cell. Illumina is currently the most widely employed technology in DNA sequencing.
  
  PacBio (2-10kb) uses single molecule real time sequencing. It uses polymerase to add phospholinked nucleotides which carry fluorescent labels on the terminal phosphate, which is cleaved away during replication. This fluorescence is recorded and a DNA sequence is discerned. They hope to read an entire genome in under an hour for $100.
  
  454 (700bp) by Roche uses pyrosequencing. Available since the mid-2000s, it is similar to Sanger Sequencing but is done in parallel and requires far less preparation. Using emulsion PCR amplification, it creates many copies of a DNA strand attached to a bead. By adding a base it can detect how much light is generated in strand synthesis, revealing the nucleotide added.

  Ion Torrent (200bp) by LifeTech is much like a very precise pH meter. It detects changes in the concentration of H+, released while polymerase synthesizes a complementary strand of DNA.
  
  SOLiD (<100bp), also by LifeTech, uses emulsion PCR like Roche, but on beads of varying size. Fluorescence emitted when fragments of DNA are added onto a strand sequence are recorded. SOLiD is very useful for detecting SNPs, insertions and deletions.
  
  next-next gen
  Nabsys (100kb) is poised to use positional sequencing. With current sequencing we have to reassemble a giant puzzle. Positional sequencing can read a length of base pairs and keep track of the location within the genome, greatly simplifying assembly. Instead of using light to detect base pairs added, Nabsys uses electronic DNA sequencing. By attaching probes to the DNA and pulling them through a nanopore, they can analyze current versus time, creating a genome-length probe map of the DNA. Done in parallel, the genomic sequence can be determined.
  
  Oxford Nanopore detects ionic currents as molecules move through a nanopore. Measuring the current allows them to identify the molecule moving through, including A, C, G and T on a strand of DNA.
  
  Depixus
DNA
- Bases of life
  
  DNA is used for biological information storage. The DNA backbone is sturdy and reliable, making it an excellent method of storing data safely.
  
  Ploidy is the number of copies of each chromosome contained in a cell. In humans, somatic cells are diploid (two homologous copies of each chromosome) while gametes (reproductive cells) are haploid (one copy of each chromosome). Some organisms contain many sets of chromosomes, such as octoploid strawberries.
  
  Nuclear DNA
  Mitochondrial DNA
  Mitochondria are the power factories of cells. Some cells can contain over 200 mitochondria, and each mitochondrion contains DNA. This mitochondrial DNA, or mtDNA, is inherited maternally. Nuclear DNA and mtDNA are generally considered to be from separate evolutionary origins. Mitochondria were probably once bacteria eaten by eukaryotic organisms. There is no gene recombination in mtDNA, resulting in the same mtDNA being passed from parents to progeny.
  
  In genomics we often work heavily with the mitochondrial genome. Mitochondrial DNA has a faster mutation rate than nuclear DNA, is short, easy to amplify and sequence, and therefore a very important genome for our lab work.
- Human Mitochondrial Genome
  The human mitochondrial genome was first published in 1981 by Anderson et al.:
  
  Anderson S., A.T. Bankier, B. G. Barrell, M. H. L. de Bruijn, A. R. Coulson, J. Drouin, I. C. Eperon, D. P. Nierlich, B. A. Roe, F. Sanger, P.H. Schreier, A. J. H. Smith, R. Staden, and I. G. Young. 1981. Sequence and organization of the human mitochondrial genome. Nature 290:457–465.
  
  The one described here is the one GenBank uses as an example (NC_001807). The mitochondrial genome is circular, nonrecombining, and maternally inherited.
  
  It is from an African (Yoruba) individual, published in:
  
  Ingman, M., H. Kaessmann, S. Paabo, and U. Gyllensten. 2000. Mitochondrial genome variation and the origin of modern humans. Nature 408(6813):708-713.
  
  Size: 16571 Nucleotides
  Content: 37 genes
  13 protein coding genes: Range in size 207 to 1812 nucleotides
  2 rRNA genes: Sizes are 954 and 1558 nucleotides
  22 tRNA genes: Range in size is 59 to 75 nucleotides, average size is 69 nucleotides
  Two replication origins: OH and OL for the heavy and light strands
  
  There is one tRNA gene for 18 of the 20 amino acids and two tRNA genes used for Leucine and Serine.
  
  The two strands differ in guanine (G) content, a property called Strand Bias. The Light Strand is guanine (G) deficient, typically containing 11-14% G among vertebrates. The complementary strand is termed the Heavy Strand. This does not hold outside of vertebrates.
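
Strand bias is easy to measure once a sequence is in hand. The following is a minimal sketch: the 20-base sequence is made up for illustration and is not taken from NC_001807, and the helper names are assumptions of this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def g_fraction(strand: str) -> float:
    """Fraction of bases in the strand that are guanine."""
    return strand.count("G") / len(strand)


def opposite_strand(strand: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return strand.translate(COMPLEMENT)[::-1]


light = "ACCATCGTACCCAATTCGAC"  # made-up light-strand sequence
heavy = opposite_strand(light)

print(f"light strand G: {g_fraction(light):.0%}")  # light strand G: 10%
print(f"heavy strand G: {g_fraction(heavy):.0%}")  # heavy strand G: 40%
```

Because every C on one strand pairs with a G on the other, a G-poor light strand necessarily means a G-rich heavy strand.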
  
  Most genes are encoded on the heavy strand. Of the protein coding genes only ND6 is light strand encoded. Both rRNA genes are heavy strand transcribed. Of the 22 tRNA genes only 8 are light strand transcribed.
  
  The two DNA strands of the mitochondrial genome are transcribed into RNA individually, each strand as a single transcript, which is subsequently cleaved. It is important to note that genes encoded or transcribed on the same strand cannot overlap, with the following exception.
  
  Among protein coding regions, the two ATPase (ATP6 and ATP8) genes are bicistronically encoded, as well as the ND4 and ND4L subunits of NADH dehydrogenase. These genes when transcribed into RNA from the same strand do have overlapping regions.
  
  This results in a unique situation of protein coding stop codons among vertebrates. Typical full stop codons are “TAA, AGA and AGG”. When tRNA genes are transcribed on opposite strands with other tRNA genes or protein coding genes they often overlap. When tRNA genes are transcribed on the same strand as a protein gene they cannot overlap and tRNA genes often punctuate the “Stop” of a protein transcript following cleavage as first described by Ojala et al. in 1981:
  
  Ojala D, J. Montoya, and G. Attardi. 1981. tRNA punctuation model of RNA processing in human mitochondria. Nature 290:470–474.

  RNAs are in general polyadenylated following cleavage, with a string of “A” bases added at the end. When the primary whole-strand mitochondrial transcript is processed into smaller transcripts that then undergo polyadenylation, the cleaved transcripts can end in what are termed Partial Stop Codons, which the added “A” bases complete. Hence:
  
  T = TAA
  
  TA = TAA
  
  AG = AGA
  
  If there is a protein-coding gene that ends in a “T” and the adjacent tRNA transcribed on the same strand starts with an “AA” this is not the same gene with a full stop of “TAA” in the protein coding segment. I outline this in detail because it is the number one problem in gene annotation (gene boundaries) among mitochondrial files in GenBank.
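
The partial stop codon rule can be expressed as a tiny helper, which may be useful when checking gene boundaries in an annotation. This is a sketch under stated assumptions: the function name is hypothetical, and the set of full stop codons is simply the one listed above.

```python
FULL_STOPS = {"TAA", "AGA", "AGG"}  # full stop codons as listed in the text


def complete_stop(partial: str) -> str:
    """Fill a partial stop codon with the A bases added by polyadenylation."""
    # Pad with A's from the poly-A tail, then keep the first three bases.
    completed = (partial.upper() + "AA")[:3]
    if completed not in FULL_STOPS:
        raise ValueError(f"{partial!r} does not complete to a stop codon")
    return completed


for partial in ["T", "TA", "AG"]:
    print(partial, "=", complete_stop(partial))
```

The A bases come from the poly-A tail added after cleavage, not from the neighboring tRNA gene, which is why the gene boundary belongs after the "T", "TA" or "AG".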
  
  In addition, it has been noted, at least among vertebrates (Boore et al., 2005), that mitochondrial protein coding genes can start with the codons of either “NTG” or “ATN” (N= G, A, T, or C) which are probably post-transcriptionally modified or read as fMet. This results in six alternative start codons from the typical “ATG” start.
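
The count of six alternative start codons follows directly from expanding the two patterns, as this short sketch shows (variable names are only for illustration):

```python
BASES = "GATC"

ntg = {n + "TG" for n in BASES}   # NTG: any base followed by TG
atn = {"AT" + n for n in BASES}   # ATN: AT followed by any base

starts = ntg | atn                       # ATG appears in both patterns
alternatives = sorted(starts - {"ATG"})  # everything except the typical start

print(len(starts))    # 7
print(alternatives)   # ['ATA', 'ATC', 'ATT', 'CTG', 'GTG', 'TTG']
```

Each pattern expands to four codons, but ATG is shared, giving seven distinct start codons and six alternatives to ATG.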
  
  Boore, J. L., J. R. Macey, and M. Medina. 2005. Whole mitochondrial genome sequencing and gene order comparisons of animals. In Molecular Evolution: Producing the Biochemical Data, Part B, E. A. Zimmer and E. Roalson (eds.), Methods in Enzymology 395:311-348.
- What does DNA do?
  
  Proteins and amino acids
  Proteins are made of 20 amino acids, which have very different properties. The charge (positive, neutral or negative) often depends on the pH of the environment.
  
  http://www.med.unibs.it/~marchesi/aacids.html
  
  http://www.mcb.ucdavis.edu/courses/bis102/AAProp.html
  
  Introns, exons
  http://en.wikipedia.org/wiki/DNA_codon_table
  http://en.wikipedia.org/wiki/Genetic_code#RNA_codon_table
- Replication
  
  DNA Replication
  
  In the lab we can amplify (make many copies of) specific genes using PCR, or clone many segments of DNA using bacteria. However, this overview will demonstrate how DNA is replicated in a cell.
  
  DNA replication begins with a double helix of DNA.
  
  DNA Helicase
  
  Formation of a replication fork: A protein called Helicase breaks the hydrogen bonds of the two strands of DNA forming a replication fork.
  
  Binding Proteins
  
  Stabilization of a replication fork: Binding proteins keep the two DNA strands apart, preserving the replication fork.
  
  Primase
  
  Primase forms RNA Primers: The enzyme primase makes a short segment of RNA called the RNA Primer on the DNA termed the template DNA.
  
  Polymerase
  
  Formation of new DNA Strands: An enzyme called DNA polymerase adds DNA nucleotides (individual A, G, T, C) to the RNA primer on the template DNA.
  
  Continuation of DNA synthesis: DNA strand synthesis continues in a 5' to 3' direction with the new strand termed the nascent strand.
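
Base pairing during strand synthesis can be illustrated with a short sketch. The function name and the example template are hypothetical; the code only models which nucleotide is added opposite each template base, not the enzymes involved.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def nascent_strand(template_3_to_5: str) -> str:
    """Return the new strand, written 5' to 3', for a template read 3' to 5'."""
    # Polymerase reads the template 3'->5' and adds the pairing base each step,
    # so the nascent strand grows in the 5'->3' direction.
    return "".join(PAIRS[base] for base in template_3_to_5)


template = "TACGGT"              # made-up template, written 3' to 5'
print(nascent_strand(template))  # ATGCCA
```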
  
  Proofreading checks DNA bases: DNA polymerase proceeds along the nascent leading strand continuously, proofreading DNA nucleotides and replacing incorrect DNA bases.
  
  Formation of continuous DNA: RNA primers are removed by proteins, and a protein called Ligase fuses the sugar-phosphate backbone of the nascent lagging DNA strand.
  
  Bidirectional synthesis
  
  Replication proceeds in two local directions
  
  Okazaki fragments form: DNA polymerase proceeds along the nascent leading strand continuously. Okazaki fragments form in about 150 bp stretches along the nascent lagging strand, fused together by ligase.
  
  Summary of DNA Replication
  A protein called Helicase breaks the Hydrogen Bonds of the two strands of DNA forming a Replication Fork.
  
  Proteins called Binding Proteins keep the two DNA strands apart preserving the Replication Fork.
  
  A protein called Primase makes a short segment of RNA called the RNA Primer on the DNA termed the Template DNA.
  
  A protein called DNA Polymerase adds DNA Nucleotides to the RNA Primer on the Template DNA.
  
  DNA Polymerase Proofreads DNA Nucleotides and replaces incorrect DNA bases.
  
  DNA Strand Synthesis continues in a 5' to 3' direction with the new strand termed the Nascent Strand.
  
  Okazaki Fragments form in about 150 DNA base stretches along the 5' to 3' Template DNA.
  
  RNA Primers are removed by Proteins and a protein called Ligase fuses the Sugar-phosphate Backbone of the Nascent DNA Strand.
  
  Mitosis and Meiosis
  Somatic cells (non-reproductive cells) contain a characteristic number of chromosomes. For mitosis to occur, the cell duplicates its chromosomes during the S-phase of its life cycle, creating sister chromatids.
  
  mtDNA Replication
- Reproduction
  Meiosis
  QTL (Quantitative trait loci) - markers on genes
- Evolution
  Carl Woese
  Protobionts, nucleic acids
- Cloning and Storage
  Genomic libraries
  
  We use DNA in biological research, such as J Robert Macey's work with reptiles and amphibians. Specimens are collected and tissue samples are taken, snap-frozen in LN2 (liquid nitrogen), then stored in our -80°C freezers.
  
  When we are ready to work with our samples we extract the DNA. We can then amplify with PCR to create many copies of a specific gene. However, there are other ways we can work with the DNA to create an entire genomic library, such as molecular cloning using bacteria.
  
  Libraries are fragments of DNA which are cloned and stored using bacteria. The clones vary in size:
  
  BAC and Fosmid clones are large-insert clones, while plasmids are small-insert clones. Using the clones of various sizes, we can assemble an entire genome by utilizing paired-end reads.
  
  BAC Clone [150kb]
  We can store our DNA inside of bacterial artificial chromosomes, or BAC libraries. Plasmids are small, circular, double-stranded segments of DNA that are not part of the bacterial chromosome. They are transmitted from one bacterium to others through conjugation, resulting in horizontal gene transfer. This allows advantageous genes to propagate. Molecular cloning utilizes this same process.
  
  Fosmid Clone [40kb]
  Fosmids are used for creating mini-BAC libraries of specific chromosomes.
  
  Plasmid Clone [16, 8, 4kb]
  Plasmid clones are small-insert clones used to create contigs.
  
  Assembly
  
  Contigs are sets of overlapping DNA fragments which, when combined together, create a complete sequence of DNA for a particular region of the genome.
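
A toy version of contig building shows how overlaps let fragments be joined. This is a deliberately simplified sketch, not how production assemblers work: it assumes error-free reads that are already in left-to-right order, and the reads and function names are made up for the example.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_in_order(fragments, min_len: int = 3) -> str:
    """Merge fragments already sorted left to right into one contig."""
    contig = fragments[0]
    for fragment in fragments[1:]:
        size = overlap(contig, fragment, min_len)
        if size == 0:
            raise ValueError("fragments do not overlap; the contig ends here")
        # Append only the part of the fragment that extends past the overlap.
        contig += fragment[size:]
    return contig


reads = ["ATGGCGTAC", "GTACCTTAG", "TTAGGACCA"]  # made-up overlapping reads
print(merge_in_order(reads))  # ATGGCGTACCTTAGGACCA
```

Where no overlap is found the contig stops, which is the gap that paired-end reads of known insert size help to bridge.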
  
  Contig assembly is assisted with paired-end reads from inserts of known size in plasmid clones.
  
  Shotgun sequencing
  
  Shotgun sequencing is a strategy for assembly of random segments in a genome.
  
  Contig assemblies are then linked together using paired-end reads to create a scaffold.
  
  Parsimony
  
  Analyzing the assembled genome, we can generate trees to determine how species are related.
PROTOCOLS
- About
  Protocols
  This resource is a compilation of all references for running our lab instrumentation. These protocols are the primary resource for our lab personnel, affiliates, interns and students.
  
  For a high-level introduction to the lab and its instrumentation, instead refer to "the lab" under Workflow.
  
  Good Laboratory Practices in Genomics
  Before entering the lab you must first sign in and learn about our general protocols. These are covered in-depth in our course BIOSC 32: Good Laboratory Practices. This will cover general contamination, along with proper practices involving gloves, laboratory notebooks, pipettes and other laboratory equipment.
  
  Good lab practices must be observed whenever you are in the lab. We wear gloves not only to protect ourselves from reagents, but also to protect the samples from ourselves.
  
  Avoiding contamination is crucial in the genomics lab, as even one small piece of errant DNA can be amplified tens of billions of times. You must watch what walls and surfaces you touch when ungloved, as you can contaminate them with your own DNA.
  
  Gloves
  At all times you must be aware of contamination. Even when pulling gloves from the shelves, contamination is an issue. Always be mindful of how you handle gloves; only pick them up and put them on while touching the heel of the glove. If you are gloved and touch your pocket or check your cellphone, your gloves will pick up DNA from the surfaces and become contaminated. If you are ever unsure, replace your gloves.
  
  Lab Notebooks
  When doing any experiments in the lab, at least one member of the group must maintain a laboratory notebook, recording every action and all results. Do not erase anything in your laboratory notebook. If you make a mistake, cross out the mistake with a single line so it is still legible. This allows us to reference the laboratory notebook if our results are not what we expected, to try and determine a cause.
  
  Pipettes
  Pipettes must only be touched with gloved hands. Learning proper volume adjustment of P1000, P200 and P20 along with which tips belong to them is necessary. For the P1000, never adjust beyond 1000μL [100 on the meter], for the P200 never adjust beyond 200μL [200 on the meter] and for the P20 never adjust beyond 20μL [20.0 on the meter]. Use the appropriate tool for the job: for dosing out 15μL, use the P20 instead of the P200 as it will be more precise. Never hold the pipette upside down, as it can contaminate the inside. When pulling in solution, do not release too fast or the solution can turn into aerosol and contaminate the inside. Always change your tips between every pipetting job.
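
The "appropriate tool for the job" rule amounts to picking the smallest pipette whose maximum covers the volume. The sketch below encodes only the maximum volumes given above; minimum volumes are not stated in this protocol and are not modeled, and the function name is hypothetical.

```python
# (name, maximum volume in µL), ordered from most to least precise
PIPETTES = [("P20", 20.0), ("P200", 200.0), ("P1000", 1000.0)]


def choose_pipette(volume_ul: float) -> str:
    """Return the smallest pipette that can hold the requested volume."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    for name, max_ul in PIPETTES:
        if volume_ul <= max_ul:
            return name
    raise ValueError("volume exceeds 1000 µL; split it into several transfers")


print(choose_pipette(15))   # P20
print(choose_pipette(150))  # P200
```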
- DNA Extraction
  DNA Extraction Resources:
  DNA Extraction Brochure [PDF] supplied by Bindiya Patel from Macherey-Nagel
  DNA extraction kit from Bio-Rad
  
  DNA extracted and precipitated from a banana using household ingredients. Strawberries make good subjects for at-home DNA extraction. They are octoploid, meaning they have 8 copies of their DNA. When the DNA is precipitated, having such a large quantity makes the DNA easy to see with the naked eye. We also extracted DNA from kiwi and banana.
  
  You can download the instructions [.doc] and do this experiment at home.
  
  DNA extraction kit from Macherey-Nagel
  Macherey-Nagel provided us with DNA extraction kits for this course. The kits allowed us to digest tissues and wash away the cellular contents, enabling the isolation and purification of DNA.
  
  Prep: The DNA is a macromolecule stored inside of the mitochondria and nuclei of cells, which are organized into tissues. The first step of accessing the DNA is to dissect our salamanders to collect a sample of tissue. We used the liver of the coastal giant salamander, Dicamptodon ensatus, and the rough-skinned newt, Taricha granulosa. We chose the liver as liver cells contain a large number of mitochondria, as many as 200 mitochondria per cell.
  
  Digest: We start to digest the tissue with enzymes, breaking it apart.
  
  Lyse: to access the DNA, we have to open up the cells and then nuclei and mitochondria by lysing them.
  
  Adjust: we adjust the DNA binding conditions by adding ethanol.
  
  Bind: We then pipette the contents of our eppendorf tube into a column over another eppendorf tube. The column has silica beads which the negatively-charged DNA will stick to.
  
  Wash: Washing the column with ethanol will wash most of the pigments and cellular contents away. We centrifuge to push the ethanol and its solvated contents through the column.
  
  Wash x2: We move the column into a new eppendorf tube and wash and centrifuge again to remove more pigment and cellular contents.
  
  Dry: centrifuge to remove remaining ethanol.
  
  Elute: after moving the column to a new eppendorf tube, dH2O is added to the column, washing the DNA off the silica and carrying it into the tube.
  
  Second elution: We performed a second elution to see how much DNA was remaining from the first. This elution contained even purer DNA, as the first elution still washed some pigments and cellular contents.
  
  DNA extraction kit from Bio-Rad
  We also used some DNA extraction kits from Bio-Rad to extract DNA from Arabidopsis. These kits followed mostly the same procedure.
  
  One large difference was that the cell walls of the plants provided a bigger barrier, requiring us to finely mince our samples. Then a proprietary salt solution was used to open the cells instead of digesting them with enzymes.
  
  DNA Precipitation
  We can create pellets of DNA.
  
  DNA Extraction review:
  Why did we have to digest and lyse the tissue sample?
  Why do we have to use ethanol for the wash, and dH2O for the elution?
  DNA Extraction review answers:
  The DNA is inaccessible until the cells and organelles are opened, spilling out the DNA.
  The ethanol will wash away most pigments and cellular contents off the column, without washing the DNA. dH2O is necessary as the exposed protons (hydrogen bonds) attract the negatively-charged DNA and pull it off the silica beads.
- PCR
  Polymerase Chain Reaction (PCR) Resources:
  Spreadsheet for PCR and gel electrophoresis calculations
  YouTube has many great videos about PCR
  
  Dustin DeMeo demonstrating how to run the PCR machine
  
  Polymerase Chain Reaction (PCR)
  PCR is the breaking apart of DNA double helixes (denaturation), the combining of primers to specific parts of a DNA strand (annealing), and the adding of nucleotides to the primer by polymerase (extension), allowing you to double the DNA at a specific location (in our case, a specific gene) in one thermal cycle.
  
  1: Denaturation is the breaking of the hydrogen bonds that hold the two DNA strands together in the double helix. Hydrogen bonds are relatively strong: an electronegative element covalently bonded to a hydrogen will 'hog' the electrons, creating a dipole. The exposed proton (hydrogen) has a strong attractive force to electronegative elements. These bonds can be broken with heat. We use 98ºC as it's enough heat to break the bonds, but low enough temperature that the enzyme (polymerase) won't denature.
  
  2: Annealing is the act of joining a primer to the DNA strand. A primer is a synthetically-derived string of nucleotides (A, G, T, C bases) that can be used as a template. At an appropriate temperature, that template will only fit onto the DNA strand at the specific sequence of nucleotides. We can bond, or anneal, the primer to a sequence of nucleotides just before a gene, allowing us to amplify (copy) that gene. Annealing is an art, not a science, as different temperatures will permit the annealing to occur at inappropriate sites. Annealing typically occurs between 45º and 60ºC. If the temperature is too cold, the primer can stick anywhere. If the temperature is too hot, it won't stick at all. We can predict appropriate temperatures based on the sequence, as A-T bonds contain two hydrogen bonds, and G-C contain three hydrogen bonds. This gives us information about how much energy is required to make the primer stick to the original DNA strand.
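
Counting hydrogen bonds in a primer is simple to automate. The sketch below also includes the Wallace rule (2°C per A/T plus 4°C per G/C), a common rule of thumb for short primers that is swapped in here as an assumption; it is not a method stated in this protocol, and the primer shown is hypothetical.

```python
def count_pairs(primer: str):
    """Return (number of A/T bases, number of G/C bases) in the primer."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return at, gc


def hydrogen_bonds(primer: str) -> int:
    """Two hydrogen bonds per A-T pair, three per G-C pair."""
    at, gc = count_pairs(primer)
    return 2 * at + 3 * gc


def wallace_tm_celsius(primer: str) -> int:
    """Rough melting temperature estimate (Wallace rule), in degrees Celsius."""
    at, gc = count_pairs(primer)
    return 2 * at + 4 * gc


primer = "ATGCATGCATGCATGCATGC"   # hypothetical 20 bp primer
print(hydrogen_bonds(primer))      # 50
print(wallace_tm_celsius(primer))  # 60
```

An estimate like this only gives a starting point; the annealing temperature is usually set somewhat below it and then tuned over several PCR runs.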
  
  3a: Extension (short) is the lengthening of our primer. After we make the primer stick, we can use an enzyme that will walk along the bonded DNA and primer, adding appropriate nucleotides to the primer. DNA has direction, much as humans understand left and right. DNA is extended (nucleotides are added) in the 5' to 3' direction (five prime to three prime). Enzymes require very specific temperatures to function, and our DNA polymerase works at a predictable speed at 72ºC.
  
  3b: Extension (long) is when we let our polymerase add nucleotides for longer than 30-60 seconds. We can end up with copies of DNA 10,000 base pairs long or longer.
  
  By using specific sequences in the DNA, we can anneal our primers to target points, allowing us to amplify a specific gene.
  
  PCR generally goes for 25-35 cycles. You can figure out how many copies you have made using 2^n where n is the number of cycles. For this class we used 35 cycles, yielding a calculation of 2^35 (34 billion) copies of each original strand of DNA (at the site of the gene selected by the primer). The cycles are continuous; in the above graph we have isolated a single thermal cycle.
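
The copy-number arithmetic can be checked with a couple of lines. This sketch assumes ideal doubling in every cycle, which real reactions only approximate; the function name is made up for the example.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Ideal number of copies after the given number of thermal cycles."""
    return starting_copies * 2 ** cycles


print(copies_after(25))  # 33554432
print(copies_after(35))  # 34359738368
```

Going from 25 to 35 cycles multiplies the ideal yield by 2^10, or about a thousandfold.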
  
  Extension can last for variable time. The polymerase enzyme we used adds about 1kb (1000 base pairs, i.e. the A, G, T and C nucleotides) per 30 seconds. Thus by letting the cycle run for 5 minutes, we are creating 10kb fragments of DNA. Manipulating the time for extension allows us to customize our gene fragment lengths.
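
The relationship between extension time and fragment length is a straight proportion. The sketch below uses the rate quoted above for the polymerase used in this class; other polymerases will have different rates, and the function names are hypothetical.

```python
SECONDS_PER_KB = 30  # rate quoted above: about 1kb every 30 seconds


def extension_seconds(fragment_kb: float) -> float:
    """Extension time needed to copy a fragment of the given length."""
    return fragment_kb * SECONDS_PER_KB


def max_fragment_kb(seconds: float) -> float:
    """Longest fragment that can be copied in the given extension time."""
    return seconds / SECONDS_PER_KB


print(max_fragment_kb(5 * 60))  # 10.0
print(extension_seconds(2))     # 60
```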
  
  In addition to adding the DNA to the PCR plate well, we have to make a small broth of chemicals (PCR mix) to make our PCR function.
  
  Buffer: provides a suitable environment (pH, etc.) for PCR.
  
  MgCl2: polymerase requires a specific orientation of the nucleotides during extension, which the Mg2+ ion helps provide.
  
  dNTP: the A, G, T and C nucleotides which polymerase can grab to extend the primer
  
  Primer 1: A sequence (in this case, 20bp long) that attaches just before a gene
  
  Primer 2: A sequence (in this case, 20bp long) that attaches right after a gene.
  
  Polymerase: The enzyme which attaches dNTPs to the primer
  
  DNA template: The strand of denatured DNA we extracted from an organism, which the primer attaches to, allowing us to extend in the 5' to 3' direction creating a complementary DNA strand
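  Because the new strand is complementary to the template, and because the second primer binds the opposite strand, it often helps to compute a reverse complement when checking primer sequences. The sketch below is a generic helper, not part of the class protocol; the example sequence is made up.

```python
# Map each base to its pairing partner (A-T, G-C).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written in its own 5' to 3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Hypothetical stretch of template, written 5' to 3'.
print(reverse_complement("ATGC"))  # GCAT
```

  Applying the function twice returns the original sequence, which is a quick sanity check.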
  
  PCR review questions
  Hydrogen bonds break at high temperatures. Why did we choose 98ºC for denaturation instead of a higher temperature?
  Why is there a range of 45-60ºC for annealing? Would all primers anneal at the same temperature? What would happen to the primer if you used a temperature that was too low, or too hot?
  Why does extension have such a specific temperature of 72ºC? Why can we allow variable times for extension?
  What is in our PCR mix?
  Why did we use two primers?
  What is long PCR, and how does it differ from normal PCR?
  
  Enzymes are proteins. Proteins denature (change their shape) when the temperature is too high. One limitation in PCR is finding enzymes that can withstand high temperatures without denaturing. The most temperature-tolerant polymerase discovered so far is from the sulfur bacteria found around hydrothermal vents at the fissures in the bottom of the ocean.
  Annealing the primer to the correct sequence depends on the A-T pairs having two hydrogen bonds and the C-G pairs having three hydrogen bonds. If the temperature is too low, annealing will be non-specific, meaning the primer can stick anywhere. If the temperature is too hot, the primer may not stick at all. Finding just the right temperature means the primer will only stick to the correct sequence of DNA. This often requires many different runs of PCR to find the right temperature for a particular primer, though we can predict an approximate temperature by calculating the number of hydrogen bonds in the nucleotide (A, G, T and C) sequence.
  Enzymes work in very specific temperatures. At 72ºC we know our enzyme will not only work, but will be adding nucleotides at a very predictable speed. Knowing the speed allows us to customize how long we want our fragments to be.
  Buffer, MgCl2 or similar salt, dNTP, primer 1, primer 2, polymerase.
  DNA is read along both strands. We needed a primer to go in the 5' to 3' direction for both strands. That way we get the DNA from the start all the way to the end of a gene.
  Long PCR is when we allow the extension phase of 72ºC to last for longer than 30 seconds. By making it last 5 minutes, we extended the nucleotide sequence to 10,000 base pairs for a long fragment of DNA.
- Gel Electrophoresis
  Gel Electrophoresis Resources:
  Gel electrophoresis made simple using skittles [.doc]
  Spreadsheet for PCR and gel electrophoresis calculations
  
  Gel electrophoresis is a way of separating molecules by size. DNA fragments are negatively charged, and by creating a charge gradient the DNA molecules will move in the direction of positive charge. By adding an agarose gel, we can allow smaller molecules to move faster and bigger molecules to move slower.
  
  A well is filled with a ladder, a reference material of known size. For our class we used ladders of 100bp or 1kb. The 100bp ladders contained molecules 100bp, 200bp, 300bp, 400bp and 500bp long. These move towards the positive end just like our DNA samples, creating a reference point of the size of our PCR products. This way we can make sure we have the right gene, as the gene will be of known length.
  
  A 1% agarose gel contains 1 g of agarose per 100 mL of liquid, so ours is made by combining 0.5 g agarose with 50 mL of water and heating in a microwave until it just begins to boil. Once the solution stops steaming, 1 mL ethidium bromide (to stain) is added. While still very hot, this solution is poured into the mold inside a gel rig and allowed to set. A comb is added, which creates wells (holes) for us to inject our ladder and 5 samples into. This gives six total lanes of molecules moving towards the positive charge.
  
  The gel a few minutes after it begins to run
  
  We also need to create 1 L of 1x TAE (TRIS Acetate EDTA) buffer from a 50x concentrated solution. We know that (concentration1)×(volume1) = (concentration2)×(volume2), so we plug in our values of (50)×(X mL) = (1)×(1000mL) and solve for X mL, giving us 20 mL of 50x buffer needed to make a 1x solution. The 20 mL subtracted from our final solution volume of 1000 mL means we need 980 mL of distilled H2O.
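  The same dilution arithmetic, (concentration1)×(volume1) = (concentration2)×(volume2), can be wrapped in a small helper for other stock strengths or batch sizes. This is a minimal sketch; the function name is our own.

```python
def stock_volume_ml(stock_conc: float, final_conc: float, final_volume_ml: float) -> float:
    """Solve C1 * V1 = C2 * V2 for V1, the volume of concentrated stock needed."""
    return final_conc * final_volume_ml / stock_conc


stock = stock_volume_ml(stock_conc=50, final_conc=1, final_volume_ml=1000)
water = 1000 - stock
print(stock, water)  # 20.0 980.0
```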
  
  Adding a little buffer, we can remove our comb. The gel rig mold is then aligned, and the rig is filled with our 1x buffer solution. We add the ladder to the sixth well on the right, then the remaining five wells are filled with our PCR products.
  
  A current is then applied, and after 1 hour and 15 minutes we have separated our genes created in PCR and can compare them to the reference ladder for size, ensuring we have the right genes.
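  Comparing a band to the ladder is usually done by eye, but it can also be estimated numerically. A common approximation is that the logarithm of fragment size falls roughly linearly with migration distance, so a band's size can be interpolated between the two ladder bands that flank it. The sketch below assumes that approximation; the migration distances are invented for illustration and are not measurements from our gels.

```python
import math


def estimate_size_bp(distance_mm: float, ladder: list[tuple[int, float]]) -> float:
    """Interpolate fragment size from migration distance.

    ladder holds (size_bp, distance_mm) pairs for the reference bands.
    Interpolation is linear in log10(size) between the two flanking bands.
    """
    bands = sorted(ladder, key=lambda band: band[1])  # nearest the well first
    for (bp_a, d_a), (bp_b, d_b) in zip(bands, bands[1:]):
        if d_a <= distance_mm <= d_b:
            frac = (distance_mm - d_a) / (d_b - d_a)
            log_bp = math.log10(bp_a) + frac * (math.log10(bp_b) - math.log10(bp_a))
            return 10 ** log_bp
    raise ValueError("band lies outside the range covered by the ladder")


# Hypothetical distances (mm from the well) for a 100 bp ladder.
ladder_100bp = [(500, 20.0), (400, 24.0), (300, 29.0), (200, 36.0), (100, 48.0)]
print(round(estimate_size_bp(32.5, ladder_100bp)))
```

  Smaller fragments travel farther, so larger distances map to smaller sizes. A band outside the ladder range raises an error rather than extrapolating.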
  
  Imaging
  
  We can remove our gels, photograph them, and print images for our lab notebooks.
  
  Gel Electrophoresis review:
  Why do we run our PCR products out on gels?
  What is the purpose of the ethidium bromide?
  Gel Electrophoresis review answers:
  To analyze them. We can see if our fragments are the size we wanted.
  Ethidium bromide is a fluorescent tag for the DNA. When exposed to ultraviolet light it fluoresces.
Glossary

Allele: alternate form of a gene

Annealing: the attachment of a new single-stranded nucleotide segment to an additional previously single-stranded nucleotide segment; the second step in PCR, where a primer attaches to a template strand for replication, typically at a temperature chosen from the range 45-60ºC

Anode: the positive pole of an electrophoresis system (red cord); note DNA is negatively charged and will run to the anode and away from the cathode (negative, black cord)

Bicistron: an RNA cleaved product that is further translated into two protein segments

Cathode: the negative pole of an electrophoresis system (black cord); note DNA is negatively charged and will run away from the cathode and toward the anode (positive, red cord)

Cistron: the smallest DNA segment encoding an RNA; this can refer to the smallest retained segments following cleavage of a larger RNA transcript

Denature (denaturation): the separation of segmented strands of biological molecules; often referred to as the separation of a double-stranded DNA molecule into two separate single strands as the first step in PCR; also referred to as the unfolding of a protein

Diploid: two sets of chromosomes in a cell

Electrophoresis: the separation of molecules (i.e. DNA or protein) across an electric field through a matrix such as a gel or capillary; negatively charged molecules run from the cathode (negative, black cord) to the anode (positive, red cord). DNA is negatively charged and will run to the anode (positive). Proteins may be positive, neutral or negative in charge, hence their migration is uncertain and changes with pH

Encoded region: a DNA segment transcribed into RNA

Extension: the step where individual nucleotides are added to the 3’-end of a primer to copy, in complement, a single strand of DNA into either a new strand of DNA or RNA; this is the third step in PCR, which is typically done at 70-72ºC

Haploid: one set of chromosomes in a cell

Haplotype: a set of linked gene versions (alleles) that tend to be inherited together, either on a chromosome where they are not separated by recombination or on a mitochondrial genome, which is maternally inherited and not subject to recombination

Heteroplasmic: two or more versions of the mitochondrial genome are found in an individual organism

Heterozygous (Heterozygote): an individual with two different versions (alleles) of a specific gene (locus)

Homozygous (Homozygote): an individual with a single version (allele) of a specific gene (locus)

Locus: a DNA region that corresponds to a gene

Loci: more than one locus (gene)

Oligo: short term for oligonucleotide or primer

Oligonucleotide: short DNA molecule typically artificially synthesized for the Polymerase Chain Reaction (PCR) often called a primer

Polymerase Chain Reaction (PCR): a method of using heat cycles with a mixture of chemicals, nucleotides including primers, and enzymes to produce very large numbers of copies of a generally targeted nucleotide segment, often a gene region of DNA

Polypeptide: a protein made of amino acids = protein

Primer: short DNA molecule typically artificially synthesized for the Polymerase Chain Reaction (PCR), often called an oligo or oligonucleotide; in natural replication, a short RNA molecule; both serve to initiate 3’ extension of a nascent (new) strand of DNA

Protein: a polypeptide made of amino acids = polypeptide

Transcript: the single stranded RNA complement product of a DNA segment from transcription

Transcription (transcribed): the process of producing a single stranded RNA molecule from a DNA complement segment.

Translation: the process of reading a messenger RNA (mRNA) nucleotide sequence into an amino acid sequence of a polypeptide or protein.

Transition: a mutation changing a purine to a purine (A,G) or a pyrimidine to a pyrimidine (C,T,U)

Transversion: a mutation changing a purine (A,G) to a pyrimidine (C,T,U) or a pyrimidine to a purine
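
The transition and transversion definitions above amount to checking whether a substitution stays within one base class. A minimal sketch, with function and variable names of our own choosing:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be A, G, C, T or U")
    if ref == alt:
        raise ValueError("no substitution: the bases are identical")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("A", "C"))  # transversion
```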

heavy strand, light strand, nascent strand, homologous, ploidy, active sites, adenylation, polyadenylated, regulatory elements, SNP, point mutation

Additional resources
Human Genomic Glossary
NIH Glossary with Citation to original papers
Additional Resources
Databases and utilities
GenBank is the premier genetic sequence database run by the NIH. It contains annotated DNA sequences that are available to the public.

BLAST, the Basic Local Alignment Search Tool, finds similarities between sequences.

PubMed is a very useful search tool for finding peer-reviewed journal articles in health, medicine and science. You can also find journal articles with other search engines, such as Google Scholar.

Berkeley Public Library allows you to apply for a card online which you can then use to access journal databases.