W: Exam 12 - 13 Protein, 14 recDNA, and 15 NCBI pg. 1-9, Vids 107-120 - review 117-120!!!


restriction endonucleases

- recognition site is usually longer than the overhang (the overhang is made by a sticky-end-producing enzyme)
- sticky ends: stick together bc the single-stranded overhangs are complementary to each other and base-pair
- blunt ends: cut sites are exactly opposite one another; ligate poorly bc there is no extra stickiness, the ends just have to hit together the right way, which is rare
- Sau3AI: cuts on the outside of every 5'-GATC-3' / 3'-CTAG-5' it can find, and bc it cuts outside the rec site, the sticky end is the same as the recognition site
- BamHI: cut occurs between 2 bases inside the site, so the sticky end is only 4 bases (look at the end) even though the recognition seq is 6
- sticky end length = how many bases stick out past the cut site (don't count the base(s) BEFORE the cut)

beta protein domain - two immunoglobulin domains: BCAM

- recognize/bind to cells, involved in cell adhesion - many antiparallel beta strands - 1 small alpha helix

NM mode: preferred way to get NM DNA into FASTA file format

- select whole seq - send to: save as FASTA file, "create file" - set it so .fasta files open in TextEdit - can then copy/paste the header, or just grab part of the material

Continue your analysis of the primer from the previous question, but this time see if this primer will form a primer dimer with the other primer in the pair, which is: ATGCTATGATTCTGCG What is the DeltaG of the hetero-dimer prediction that we should most worry about?

-3.61

Analyze this primer candidate on the IDT website with their OligoAnalyzer tool: AAACATCGAGAGATC What is the DeltaG of the self-dimer prediction that we should most worry about?

-4.62

cloning basics

goal: get DNA fragments into a plasmid, ending with 1 type of bacteria with 1 type of plasmid with 1 type of DNA fragment (clone). Need to:
1) create plasmid prep: just grow out E. coli containing plasmids and extract DNA
2) obtain DNA fragments from PCR OR digest genomic DNA to make a library
3) cut DNA fragment/plasmid with restriction digest/enzyme, run on gel, cut out bands, purify DNA from the bands
4) ligate with DNA ligase at the restriction enzyme junction, creating the connected plasmid/DNA fragment
5) transformation of E. coli competent cells (ready to receive DNA), shocked so they can take it up
6) plate transformed E. coli cells onto selective medium - only ones with plasmid grow
7) plasmid preps again of several colonies
8) test insertion by restriction digest/PCR

gene is in

green

Intrinsically Disordered Proteins......

have large sections of protein sequence which can form many alternate structures

tooltip

hovering over an mRNA/protein section opens a window with a lot of info - mRNA, title, location, where you can download it through GeneID, BLAST

gene model rendering

how you organize the info, with/without introns, gene bar

in "gene" menu of gene bank record

if gene = RPL21P4, means ribosomal protein L21 pseudogene 4 - pseudogenes are nonfunctional gene copies scattered across the genome - pseudogenes are often in introns of other genes, which is where this one landed - click GeneID: should be hashed, going in opposite direction of BRCA1

within GeneID, click the promoter tab (GenBank record)

in eukaryotes:
- shared regulatory region of BRCA1 gene and antisense transcript
- talks about silencer elements and everything the promoter does (bidirectional or not), has an interactive picture of it
- if bidirectional, promotes transcription of RNAs on either side!!
- red elements = promoter elements
- silencers = turn off promoters, negative regulation
- enhancer = positive regulation (showing both positive and negative regulation of transcription)
(proto-oncogene: normal gene that can become an oncogene when mutated/overexpressed; long non-coding RNA listed as antisense gene, which acts to turn off the gene)

For the gene above, staying in GeneID, where is this gene expressed the most? The least?

most: kidney; least: pancreas

lncRNA

long non-coding RNA (doesn't code for anything, modulates something)

For this Arabidopsis gene, STP1, in which tissue is it expressed the highest? [Hint: look at the plant drawings in Araport]

mature leaf

right panel of NCBI main page

most popular tools used to analyze seq (BLAST/genome)

Look up the entry for human multimerin 1 (MMRN1). Choose the RefSeq transcripts option. What is the Locus ID for Transcript Variant 1 and what is the length of the transcript?

not: NM_007351, 5001 bases

Go to the Genome Data Viewer for BRCA1 and zoom in on the promoter region just upstream of the BRCA1 transcript. Make sure you look upstream and not downstream! Which of the following is a transcription factor protein that binds to one of the promoter elements listed in the promoter for this gene?

not: Ppr23

What kind of enzyme is coded for by the gene located at 42,124,979 on human Chromosome 17? [Hint: Hover over the gene and click GeneID in the Tooltip box to find this information]

not: serine kinase

A Greek key motif is a beta sheet with:

one of the loops connecting the beta strands not being a hairpin loop

hashing around RNA (purple)

pseudogene

RNA is in

purple

coding region is in

red

know how to fill in given 3 bases

reverse complement: pairs are G-C, C-G, A-T, T-A - 1) take the opposite (complement) 2) flip (reverse) - GAA: opposite = CTT, flipped = TTC
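
The two steps on this card can be sketched in a few lines of Python. This is only an illustration of the pairing rule; the function names are made up here, and ApE's own "copy reverse complement" command does the same job inside the program.

```python
# Pairing rules from the card: G-C, C-G, A-T, T-A.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    """Step 1: swap every base for its partner (the 'opposite')."""
    return "".join(PAIRS[base] for base in seq.upper())

def reverse_complement(seq):
    """Step 2: flip the complemented string end to end."""
    return complement(seq)[::-1]

print(complement("GAA"))          # CTT
print(reverse_complement("GAA"))  # TTC
```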

tooltip over gene itself

right click on gene name click: view GeneID (pulls up full report)

in genome browser

see part of human genome containing the gene searched
- tells you chromosome number, position in base pairs
- can see gene itself and all mRNAs (NM) + proteins (NP) + all different variants recorded for gene
- solid buttons are exons (just not zoomed in), introns between them
- taller mid line behind exon = mini exons, 2 short lines between = introns
- arrows = read from that direction (3' left)
- can drag labels around from top to bottom of colored lines
- clinical variation = shows disease area + mutations in dark blue
- RNA-seq data last, in blue bars
- many forms too

ApE - Restriction Enzyme Analysis

to find enzymes that cut twice:
- make SURE linear is on (top right, click to change)
- graphic map + U: shows annotations next to unique restriction enzyme sites; same spot = same name; shows seq if you hover over; usually shows blunt vs sticky end cut: A^.... = 5', .....^A = 3'
- show all enzymes: enzyme selector, can pick restriction enzyme, shows # of cuts
- seq, compatible (overhangs), unique + six cutter (no >) and select: have six-base recognition site

NM mode: version vs variant

variant(#) = variations of a seq, diff each time - BIOLOGICAL: lack portions of UTR or coding region, missing codons
version(#) = iterations of the same seq

how to examine a GenBank record

1) BRCA1 gene search, hit genome browser
2) right click on gene title and click GenBank view in pop-up menu
3) on GenBank page, click pull-down menu for format
* chromosome name in title - NC mode is whole chromosome, looking only at the position slice #..# (numbers with dots in between), which is XXX bp (on top)
* can see journals cited/references, authors
4) features area: source, gene, mRNA, codons, translated form
* mRNA annotations: complement(join(#..#,#..#,#..#)) skips many INTRONS (the commas); each #..# range is 1 exon!
* each mRNA variant is labeled "transcript variant #" (1-6)
* CDS: list of exon ranges covering only the codons of the mRNA; base 1 of the mRNA is 5' UTR, so the codons can start at a later position and end at an earlier position (the rest is 3' UTR)
5) some translated isoforms (in CDS) may be longer or shorter than others due to exon skipping (ribosomal protein pseudogene usually sits in an intron)
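
To make the complement(join(...)) notation concrete, here is a minimal Python sketch that pulls the exon ranges out of such a string and adds up the spliced length. The function names and the coordinates are invented for illustration (they are not from the BRCA1 record), and partial-range markers such as < and > are not handled.

```python
import re

def parse_location(location):
    """Return (strand, exons) from a GenBank-style location string."""
    # complement(...) means the feature is on the minus strand.
    strand = "-" if location.startswith("complement(") else "+"
    # Every start..end pair is one exon; the commas between them are introns.
    exons = [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]
    return strand, exons

def spliced_length(exons):
    """Total bases left once the introns are skipped."""
    return sum(end - start + 1 for start, end in exons)

# Made-up coordinates, purely for illustration.
strand, exons = parse_location("complement(join(100..150,300..420,900..1000))")
print(strand, exons, spliced_length(exons))
```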

find gene info shortcut

1) type name of gene in search bar on main page, then a space 2) add name of the organism it's found in: more specific, so fewer results

NCBI Navigation 1)

1) type name of gene in main page search bar, will see results by database menu, where the number next to Gene (in blue circle) is HITS - for example, 19,000 does not mean 19,000 genes, just how many gene entries have been deposited (variants) * # by Protein: protein accessions * # by PubMed Central: # of papers published

Open up the file "EngD cellulase" in ApE. What is the length of the major ORF?

1548

There is a unique HindIII site located in the DNA of "EngD cellulase". At which base does this HindIII site start?

1704

RefSeq listings

19,000 gene entries based on 1 gene, the 1 RefSeq represents them all

For human gene TMEM106A, on which exon is found the translation start site? [There are 4 variant mRNAs, and one is an oddball, so give me the answer that works for 3 out of the 4 variant mRNAs]

3

In "EngD cellulase", how many times does the restriction enzyme NdeI cut this sequence?

3

Look up the entry for ERBB2 and go to the RefSeqGene version. At the GenBank page, click the Graphics hyperlink. How many exons does this gene have? [Hover over black exons to get the numbers]

32

BamHI and Sau3AI

5' end is always LOT, ROB (left on top, right on bottom), even if strand is snipped
- BamHI: 5' sticky ends; the 5' end is on the left in the upper strand and on the right in the bottom strand, ALWAYS, so if bases/overhang are sticking out from top left or bottom right: 5'
- PstI: 3' sticky ends; the 3' end is on the right in the upper strand and on the left in the bottom strand, ALWAYS, so if bases/overhang are sticking out from top right or bottom left: 3'
- blunt: cut is at the same site in both top and bottom strands of the rec seq, not offset
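
As a way to check yourself on 5' vs 3' vs blunt, the sketch below classifies the end from where the top strand is cut. It assumes a palindromic recognition site, so the bottom-strand cut is the mirror image of the top-strand cut; the function name is hypothetical, and EcoRV (GAT^ATC) is added here only as a blunt example that is not on the card.

```python
def describe_cut(site, top_cut):
    """Classify the end left by an enzyme with a palindromic site.

    site    -- recognition sequence, top strand written 5'->3'
    top_cut -- number of bases to the left of the cut on the top strand
    """
    # On a palindrome the bottom strand is cut at the mirrored position.
    bottom_cut = len(site) - top_cut
    if top_cut == bottom_cut:
        return ("blunt", "")
    if top_cut < bottom_cut:
        return ("5' overhang", site[top_cut:bottom_cut])
    return ("3' overhang", site[bottom_cut:top_cut])

print(describe_cut("GGATCC", 1))  # BamHI  G^GATCC
print(describe_cut("CTGCAG", 5))  # PstI   CTGCA^G
print(describe_cut("GATC", 0))    # Sau3AI ^GATC
print(describe_cut("GATATC", 3))  # EcoRV  GAT^ATC
```

The overhang string is read off the top strand; its length is the "how many bases stick out" count used for sticky ends.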

What is the molecular weight of the protein coded by the major ORF in the file "EngD cellulase"? [hint: try translating it]

56.0 kDa

NCBI find a gene

? button (help), legends, "graphical view legend"

to navigate genome

? on upper right hand corner, help, "navigating the seq viewer" under general section - pan arrows: go L/R (shows additional gene neighbors) can be dragged - zoom slider - A&T symbol: zoom to the sequence level (ATTG) - zoom to range - undo: go back

Here is the first three bases of the recognition site of a standard 6-base restriction enzyme: GAT. What are the next three bases on this strand?

ATC
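
Why ATC: a standard recognition site reads the same 5' to 3' on both strands, so the second half is the reverse complement of the first half. A small sketch, assuming a perfectly palindromic site (the function name is made up for this example):

```python
def complete_palindromic_site(first_half):
    """Build the full site from its first half, assuming a perfect palindrome."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    first_half = first_half.upper()
    # Second half = complement of the first half, read in reverse.
    second_half = "".join(pairs[base] for base in reversed(first_half))
    return first_half + second_half

print(complete_palindromic_site("GAT"))  # GATATC
```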

What is one way to prevent getting your insert DNA put in backwards in the insertion site of the plasmid?

Cut the insert with two different restriction enzymes and clone into a multiple cloning site

On the NCBI Main Page, look up the Protein entry for mannan-binding lectin serine protease 1 isoform 2 precursor [Homo sapiens]; verify the Accession number: NP_624302. Go to the GenPept page. There is listed an article with the lead author Oroszlan in the Journal of Immunology. Look up the article with the PubMed link. What are the first words of the first sentence in the abstract?

Factor D (FD) is an essential element

NM vs NP

NM = mRNA = purple NP = protein = green

ApE - ORF Mapping

ORF map: frames 1, 2, 3 and -1, -2, -3 (top 3 are the only valid ones for a + RNA virus)
- jump from 3 to 1 if you have spaces
- vertical half line: start (Met); full line: stop
- cluster of short ones: mathematical (short, random); biological: start, space, stop

genetics of cloning

pUC plasmids have:
- ori: origin of replication - genetic segment recognized and replicated by the DNA polymerase of E. coli (all circular DNA w/ ori will be replicated)
- AMP: ampicillin resistance gene - ensures only E. coli containing the plasmid (w/ DNA and AMP resistance gene) will grow on ampicillin, needed bc TRANSFORMATION is inefficient
- multiple cloning site (MCS): prevents backwards insertions (gene/DNA of interest going into cloning site could flip 180°) - cut plasmid and insert w/ BamHI + PstI instead of BamHI and BamHI
- X-gal: also need to ensure all PLASMIDS that have the AMP-res gene have the gene/DNA frag insert, bc LIGATION is inefficient - beta-galactosidase made from the plasmid splits X-gal in the medium into a blue product
* intact gene = blue, bc beta-galactosidase is active
* gene disrupted by the DNA insert = no beta-gal produced, no color made, GOOD
- you WANT colonies that grow (AMP) but are also white (X-gal)

curators of protein classification

Pfam - main curator
SCOP - structural classification of proteins, manual curation of protein classification
CATH - class/architecture/topology/homology

reading a GenBank record through NM mode (mRNA!!!)

REFERENCE SEQUENCE - seq version which replaces previous mRNAs
- in genome browser of searched gene, look at genes, NCBI menu (not Ensembl)
- select the PURPLE mRNA
- can see length of mRNA (select NM option in GenBank view), accession number, version number
- locus has NM: can click PubMed and full text, seq version
- diff variants: look down to comment information and find transcript variant information there
- all codons merged together to give the codons for the entire gene! CDS says 195-5645, so everything before base 195 is the 5' untranslated region
- in origin section of features: numbering only goes up to the length of JUST the mRNA (written in DNA language)
- can click graphics page: shows you JUST where EXONS are

Open up the file "EngD cellulase" (from Canvas/Files/Sequences) in ApE. In what reading frame is the major (biggest) ORF? Approximately which base does this ORF start at?

Reading Frame 1; 240

In the basic cloning procedure, what is the order of steps?

Restriction digest, ligation, transformation, plating, plasmid preps

can you ligate a BamHI end to a Sau3AI end

yes: both enzymes leave the same 5' GATC sticky end (BamHI: G^GATCC, Sau3AI: ^GATC), so the overhangs base-pair and ligate
- hybrid site = G (from the BamHI side) + GATC + next base from the Sau3AI fragment
- if that next base is C: hybrid is GGATCC, BamHI can cut it again
- if it is anything else (e.g. A gives GGATCA): could not cut with BamHI anymore, bc we gained A
- Sau3AI can always cut the hybrid: GATC (bc sticky end is the recognition site, and that is maintained in both)
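
The hybrid-site reasoning can be checked by brute force over the four possible next bases. This is a toy sketch under the assumption that the BamHI side contributes G plus the GATC overhang and the Sau3AI fragment contributes whatever base followed its GATC; both function names are invented here.

```python
def hybrid_site(next_base):
    """Site formed when a BamHI end (G + GATC overhang) joins a Sau3AI end.

    next_base -- the base that followed GATC on the Sau3AI fragment
    """
    return "G" + "GATC" + next_base.upper()

def recut_by(site):
    """List which of the two enzymes still find their recognition site."""
    enzymes = []
    if "GGATCC" in site:
        enzymes.append("BamHI")
    if "GATC" in site:
        enzymes.append("Sau3AI")
    return enzymes

for base in "ACGT":
    print(hybrid_site(base), recut_by(hybrid_site(base)))
```

Only the C case regenerates GGATCC, while GATC survives in all four, which is the same logic as "the enzyme whose sticky end equals its recognition site always recuts the hybrid".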

Below are two restriction sites. The top one is "A" and the bottom is "B". You digest them and then ligate one to the other to form a hybrid restriction site. Which enzyme is always able to digest this hybrid restriction site? Site A: A|TGCAT over TACGT|A. Site B: |TGCA over ACGT|

The enzyme which cuts B

ApE 1

a plasmid editor (free app)
- tells you how many bases of RNA in "sequence"
- changing number = what base you're on
- tells you start, end, and length of selection in base pairs
- ORF menu + find next: shows you next ORF beginning with ATG
- can name/edit features (ORF), pick color for it; scroll to colored spot: tells you name of the gene you're in
- put cursor 30 bases before end or at end and hit find next again, repeat
- can see if sections are shared between ORFs (overlapping)
- new feature: name: replicase: color change

in RNaSeq menu

after looking up gene and clicking RNAseq transcripts, being on gene bank, click graphics

ApE - translating

always have to select something before you use translate - pick b/t 1 letter and 3 letter code - shows DNA complete translation - shows MOLECULAR WEIGHT (how many AA, position in genome)

middle of NCBI main page

Analyze: holds all the tools needed for data analysis

boutique: for every gene, right click

and click picture link (annexin is a Ca2+-binding protein localized in the cytoplasm) - PubMed IDs and journals - genomics info - protein - where produced (time), function of protein, family, subcellular location, and tissue specificity - gene expression heatmap: becomes expressed after germination, in root not in shoot

A zinc finger comprises zinc... and what else?

beta sheet and alpha helix

The "Alpha + Beta" protein domain consists of:

beta strands and alpha helices strung separately from each other and beta strands are antiparallel

repeats are in

blue

format for chromosome, location, and spread

chr1: 1,500,000 / 200 - the last number (200) = how wide a spread of bases to look at around that position
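
To show how the three pieces of this format break apart, here is a small parsing sketch. It only splits the text from the card into chromosome, position, and spread; it is not how NCBI's search box works internally, and the function name is hypothetical.

```python
import re

def parse_region(text):
    """Split 'chr1: 1,500,000 / 200' into (chromosome, position, spread)."""
    match = re.fullmatch(r"\s*chr(\w+)\s*:\s*([\d,]+)\s*/\s*(\d[\d,]*)\s*", text)
    if match is None:
        raise ValueError(f"not in chr#: position / spread format: {text!r}")
    chrom, position, spread = match.groups()
    # Commas are only thousands separators, so drop them before converting.
    return chrom, int(position.replace(",", "")), int(spread.replace(",", ""))

print(parse_region("chr1: 1,500,000 / 200"))  # ('1', 1500000, 200)
```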

feature ruler

click a particular RNA or protein - tells you how many bases found in this exon 1-42, 43-X, Y-Z - protein = minus 5'UTR, 3'UTR missing (ends early)

RefSeq transcripts

click any one to open the GenBank record of the mRNA you pick

copy reverse complement

command only found in ApE

learn section of NCBI main page

conferences, recorded tutorials, manuals, documents, and FAQs

misc_features

could be promoter regions, miscellaneous recombination features

in "regulatory" menu of gene bank record

could see silencers: responsible for turning gene off
could see enhancers: misplaced promoter
could see promoter itself here: click GeneID

ApE - editing

create PCR primer candidate: select bases that stick to a selected range
- upstream primer: highlight + right-click the BEGINNING BASES of the top strand of an ORF, copy, and paste in Word to load into IDT
- downstream primer: highlight + right-click the same # of LAST BASES of the ORF; don't use plain copy, use COPY REVERSE COMPLEMENT
- (length is shown at the top of the window, not in the pop-up; new feature is a tab on top)

left panel of NCBI main page

different topics, can click to see available tools for a topic - proteins - literature

RefSeq transcripts + proteins caused by - click RefSeqGene to get to gene menu, same as genome browser

exon skipping which usually creates more transcript and protein variants than actual single gene variants

all find a gene info

expression info, bibliography, phenotypes, associated conditions, CNV, relationships with HIV, chemical pathways, interactions with other genes/proteins, general gene info, gene ontology (types of binding), what types of break repair it is involved in, cell structure connections

The beta-propeller structure consists of beta sheets with:

four consecutive beta strands, perpendicular to the radius of the propeller, in each "blade" so that each blade is a separate unit of sequence

if numbers are going down on scale, means

genome is being read backwards, so gene should be read in opposing direction - gene is on negative strand instead of positive strand of genome

vid

(ORF MAP: interpret them, frames 1,2,3 and -1,-2,-3; reading frame is 3 bc half lines indicate start sites and full lines indicate stop sites; how to translate in 1 letter and 3 letter code - get the full peptide seq from DNA, number of amino acids; can copy reverse direction) wants to know you can open IDT and ApE

protein motif: helix-loop-helix

(also called leucine zipper) - 1 alpha helix - loop (long, lazy turn) - 1 alpha helix found in TF, 2 alpha helices bind to specific seq of DNA, other portion meant to stabilize - leucines stick together bc nonpolar, stability

dark green vs light green

(going R to L)
- exon: (tiny lines +) whole thing with dark green/white arrows
- translated (dark green): translational start site is right BEFORE the 1st dark-green/white arrow box, going down in exons
- UTR: light green (usually big on 3' end - NOT)
- if gene variants underneath have light green boxes in diff places = exon skipping/splicing (then translational start site)
- arrows: always 5' to 3', just R to L or L to R

intrinsically disordered proteins (IDP/IUP)

- 1 alpha helix in center (stays same) - JUST 2 ends with tracing over time (structures that were collected over time, structure of protein changes over time besides alpha)

protein motif: greek key

- 1 antiparallel beta sheet (2 come in interior, third comes all the way around, final one out) - said wont be tested on

protein motif: zinc finger

- 1 beta sheet w/ 2 antiparallel strands in it - 1 small alpha helix - 2 histidines + 2 cysteines which hold a stabilizing zinc ion - motif found in TFs

protein motif: beta hairpin

- 2 adjacent beta strands running antiparallel to each other - with tight hairpin (not loop or turn) holding them together

protein motif: helix-turn-helix

- 2 alpha helices separated by turns

small proteins

- 3 DS bonds - STD (sequential tri-disulfide bonds) making it extremely stable - cysteine-stabilized - often used in venoms or microbial peptides (need to be stable so they can be stored for ready use)

protein motif: antiparallel beta sheet

- 3 antiparallel beta strands in 1 sheet (bc arrows go in opposite directions) (arrows point from N to C = polarity of protein)

beta protein domain - SH3 domain:

- 5 strand beta barrel

alpha/beta protein domains

- alternating parallel beta strands and alpha helices - strung together - TIM barrel from Triose Phosphate IsoMerase alpha-beta-alpha-beta-beta-alpha

beta protein domain - ten immunoglobulin domains: BCAM

- antigen-binding domain of antibody - alpha helices

beta protein domain - beta barrel:

- antiparallel beta sheet in 3D - porin allowing ions to go through

alpha + beta protein domains

- beta antiparallel strands, independent of alpha helices - alpha helices at beginning and end - all strung separately from each other alpha-beta-beta-beta-alpha

alpha protein domains:

- bromodomain: 4 alpha helices - globin-fold domain: 8 alpha helices surrounding a heme group with oxygen in middle

select range of something

- click and drag across K bar, then right click for a menu of possible options

for NCBI

- dont use safari, use chrome - dont use ipad (mac or pc only)

what NEB has***

- double digest finder (space needed to cut 2 restriction enzymes) - enzyme finder: how compatible buffers would be (find by name, rec site, overhang seq) - biocalculator for ligation (mass to moles) - buffers, incubation - video tutorials - stuff you can order, guidelines, protocol

to go to seq level

- drag in menu so dark box of interest is between two gray middle arrows - select atm zoom to feature button - undo blue arrow in upper left to restore unzoomed view whenever

BRCA1 summary info

- gene encoding a protein that is phosphorylated in the nucleus, acts as a tumor suppressor, part of a large complex - BASC: BRCA1-associated genome surveillance complex - associated w/ RNA polymerase II and transcription, repair of double-stranded breaks, and NHEJ - alternative splicing alters functions (genes can share promoter machinery)

genomes and maps on left panel - genome - human genome

- how to find gene if you know the genome/species of an organism it belongs to and location - 1st: pick a chromosome (zoom out) - type location, gene name, or phenotype in search assembly, following format * chr#:(position#) / (spread#)

integrated DNA technologies

- OligoAnalyzer tool - need an IDT account
- type a primer seq and hit self-dimer for BOTH primers (bc 2 for PCR)
- also test upstream and downstream primers against each other (by inserting the secondary seq) and hit hetero-dimer
*** 2 rules for seeing if a primer is rejected (primer-dimer: primer that is complementary to itself or its partner and can do PCR on itself)
1) 3'-most base is base-paired (paired to anything else)
2) delta G is LESS (more neg) than -3, meaning the dimer is more stable
- if it meets both, the dimer is stable enough to cause us problems *binds to itself*, have to reject the primer for PCR
- solution: add a base to the 3' end to extend the primer (for both self- and hetero-dimer)
- $15 for 1 kb DNA
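
The two-rule test can be written as a single check. This sketch does not call the IDT website; both inputs are values you read off the OligoAnalyzer dimer output by eye, and the function name and the True/False flags in the examples are hypothetical.

```python
DELTA_G_CUTOFF = -3.0  # from the card: worry when delta G is more negative than -3

def reject_primer(three_prime_base_paired, delta_g):
    """Apply both rules to one dimer prediction; reject only if both are met."""
    return three_prime_base_paired and delta_g < DELTA_G_CUTOFF

print(reject_primer(True, -4.62))   # True: 3' base paired and delta G below -3
print(reject_primer(False, -4.62))  # False: stable, but the 3' end is free
print(reject_primer(True, -2.5))    # False: 3' base paired, but too weak to matter
```

Run it once for each self-dimer prediction and once for the hetero-dimer prediction, since a primer has to pass all of them.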

protein structure repositories

- Protein Data Bank (PDB), Swiss-Prot, PIR, PRF - Molecular Modeling Database (MMDB) provides access to PDB in NCBI

