MBIOS 301 Exam 4 Study Guide

functional protein microarrays

- consists of many different cellular proteins - used to probe the function of proteins note: requires HUGE investment of antibody development, and is mainly used by drug companies for drug discovery.

how is the DNA sequence found from the protein sequence?

- determine the possible codon sequences - used as a query sequence in DNA databases search - gene sequence is used to predict the amino acid sequence of the entire protein.
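
As a rough illustration of the first step, the number of possible codon sequences for a short peptide can be counted from the degeneracy of the standard genetic code. The function name and the example peptides are illustrative assumptions, not part of the course material.

```python
# Number of codons that encode each amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons are covered).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_possible_coding_sequences(peptide: str) -> int:
    """Count the DNA sequences that could encode the given peptide."""
    total = 1
    for amino_acid in peptide.upper():
        total *= CODON_DEGENERACY[amino_acid]
    return total


print(count_possible_coding_sequences("MWK"))  # 1 * 1 * 2 = 2
print(count_possible_coding_sequences("MLS"))  # 1 * 6 * 6 = 36
```

Peptide stretches rich in Met and Trp give the fewest possible codon sequences, which is why they tend to make good starting points for a query.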

antibody microarray

- collection of antibodies that recognize short peptides - used to assess the level of protein expression

shotgun sequencing

-easier and faster method -does not require an extensive physical map -clones from a genomic or chromosomal library are isolated randomly and sequenced multiple times. -overlapping sequences are matched together using computers.
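
The computer matching step can be sketched as a greedy overlap merge. This is a toy illustration under simplifying assumptions (error-free reads, one strand, no repeats), not how production assemblers work; the function names and example reads are hypothetical.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap_length(a, b, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return reads  # nothing left to merge; each item is a contig
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
# ['ATGGCGTACCTGA']
```

Sequencing each region multiple times is what makes overlaps like these likely to exist between randomly chosen clones.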

post-translational modification

-irreversible changes may be necessary to produce a functional protein ex: proteolytic processing, disulfide bonds, attachment of prosthetic groups, sugars and lipids -reversible changes that transiently affect protein function. ex: phosphorylation, methylation, and acetylation

alternative splicing

-most important alteration that occurs in eukaryotes -single pre-mRNA is spliced into more than one version -often cell specific or related to environmental conditions

RNA editing

-much less common than alternative splicing -leads to changes in the coding sequence of mRNA

why are protein microarrays more challenging?

-proteins can more easily be damaged during microarray fabrication (sensitive) -synthesis and purification of proteins are more difficult.

mass spectrometry (determining amino acid sequence of a protein).

-separates peptides based on their mass-to-charge ratio -peptide mass correlates with amino acid composition -amino acid composition can be searched against a database to determine gene identity
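
To see why peptide mass reflects composition rather than order, here is a minimal sketch that sums monoisotopic residue masses. The mass values are standard figures rounded to five decimals; the function name and the example peptides are assumptions for illustration.

```python
import math

# Monoisotopic residue masses in daltons (amino acid minus one water).
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once for the free N- and C-termini


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER


print(round(peptide_mass("GAS"), 2))
# Same composition in a different order gives the same mass.
print(math.isclose(peptide_mass("GAS"), peptide_mass("SAG")))
# Leucine and isoleucine cannot be told apart by mass alone.
print(math.isclose(peptide_mass("GLK"), peptide_mass("GIK")))
```

Because order does not change the mass, a second round of fragmentation is what lets the instrument read out sequence rather than composition alone.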

what are the two ways contigs form?

1. assembled piece of DNA sequence built from short sequencing reads (most common). 2. A set of overlapping DNA clones that represent a region of chromosome. note: chromosomal DNA is partially digested and fragments are cloned to create genomic DNA library. CRITICAL FACTOR!

the proteins a cell produces are dependent on:

1. cell type 2. stage of development 3. environmental conditions

what are characteristics of prokaryotes?

1. circular genome 2. <5Mb 3. 500-5,000 genes 4. "compact" genomes - gene density ~1/kb exception: linear chromosomes

what are the three ways programs are used in bioinformatics to analyze sequences?

1. locate specialized sequences (sequence element or motif). 2. locate predefined sequences and identify specific types of sequence organization or sequence elements. 3. locate a pattern of symbols

what are the three main phases of genomic analysis?

1. mapping-historically came first 2. sequencing-more popular 3. functional genomics

Goals of the Human Genome Project

1. obtain a genetic linkage map of the human genome 2. obtain a physical map of the human genome 3. obtain the DNA sequence of the entire human genome. 4. develop technology for the management of human genome information. 5. Analyze the genomes of other model organisms 6. develop programs focused on understanding and addressing the ethical, legal, and social implications from the results of the human genome project. 7. develop technological advances in genetic methodologies.

mass spectrometry steps:

1. protein purified from gel slice 2. protein fragmented by enzymes 3. protein fragments subjected to mass spectrometry 4. fragments are then fragmented again 5. mass spectrometry performed again

what are the steps of ChIP assays?

1. proteins cross-linked to DNA 2. DNA isolated from cells and broken into small pieces 3. antibody used to precipitate the protein-DNA complex. 4. DNA is purified and amplified with PCR 5. sequence of the amplified DNA can be identified by using it as a probe on a microarray. Note: most useful for transcription factors

BLAST

Basic Local Alignment Search Tool: used to search databases to find alignment between newly sequenced genome and genes that have already been identified in the same or different species. -identify homologous genes that are evolutionarily related in other organisms -using these searches it can help assign putative function to a sequence of interest.

comparative genomics

Compares genomes of different organisms to answer questions about genetics and other aspects of biology -incorporates the study of gene and genomic evolution. -explores the relationship between organisms and the environment -studies differences and similarities between organisms and their contributions to phenotype and life cycles.

molecular markers

DNA sequences that do not encode genes but can be mapped. Molecular markers also have to be POLYMORPHIC

Why is the proteome larger than the genome?

Due to alternative RNA splicing, post-translational modification, and RNA editing.

Next generation DNA sequencing

Ion Torrent/Proton; SMRT (single-molecule real-time) sequencing

computer program

a defined series of operations that can analyze data in a desired way.

Ion Torrent/Proton

adds a single base at a time and monitors the release of protons; much cheaper, with no need to separate reaction products.

transcriptomes

after transcriptional regulation -in theory show all transcribed genes - usually Illumina methodology is used: extract mRNA, make it into cDNA, and sequence all of it

repetitive sequences

are also widespread in eukaryotic genomes (transposons) -the largest factor in the difference in genome sizes

single nucleotide polymorphisms (SNPs)

associated with disease conditions (sickle cell or cystic fibrosis).

basic research

characterization of genes and genomes

homologs

closely related genes (high DNA sequence similarity).

gene knockout collections

collections of knockout strains for every gene ex: NIH knockout mouse project

evolution

comparative genomics

Encyclopedia of DNA elements (ENCODE)

created with the aim of using experimental approaches and bioinformatics to identify and analyze functional elements that regulate expression of human genes.

RNAseq

current standard for demonstrating reproducibility=sequence a minimum of 3 biological replicates-libraries made from 3 identical RNA extractions. -also reveals gene expression in cells and tissues

Algorithm-based software programs

developed for creating DNA sequence alignments (lined up and compared).

agriculture

development of improved traits

SMRT (single-molecule real-time) sequencing

a different fluorochrome is attached to each type of base (all added at once = much quicker). fluorochromes are attached to a phosphate that is cleaved off the dNTPs and ends up in the pyrophosphate product. Works in REAL TIME.

pattern recognition

does NOT rely on specialized sequence info. identifies a pattern of symbols that can occur within any group of symbol arrangements.

Illumina sequencing

each base is fluorescently labeled, all have proprietary removable terminators. a. one base added at a time b. imaged to see which base is added c. terminator is removed d. repeat

proteome

entire collection of a species' proteins

proteomics

examines the functional roles of the proteins that a species can make.

e-value

expected value: statistical analysis comparing result with random chance - represents the number of times that a match or a better one would be expected to occur by random chance. note: the lower the e-value the more significant the match (0=identical match).

what can DNA Microarrays be used for?

- finding genetic variations - cell-specific gene expression - gene regulation - tumor profiling - microbial strain identification

structural genomics

focuses on sequencing genomes and analyzing them to identify genes and important sequences such as regulatory elements.

important note:

for many model organisms you can order specific gene "knockouts" and study the effects of losing those genes on phenotype.

bioinformatics gives insights into:

- gene structure - gene function - relationships between genes and organisms - protein function - protein interactions - predicting drug structure and function

what is the first step in computer analysis of genetic data?

generation of a computer data file (collection of information in a form suitable for storage and manipulation). files are usually annotated and stored in a database.

genomics

genes and other nucleic acid sequences

orthologs

genes at the same locus in different species inherited from a common ancestor.

transcriptomics

genes expressed in a given sample

important note:

genome sequences have opened new possibilities for analyzing transcriptomes and proteomes.

functional motifs

helix-turn-helix, leucine zipper, or zinc finger motifs

physical mapping information:

historically involved making libraries of chromosomal DNA; very time and labor intensive. ultimate goal: obtain a complete contig for each type of chromosome.

proteomics

how proteins interact

medicine

identification of genetic bases of disease

important note:

identification of open reading frames requires translation of all 6 reading frames.
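
A minimal sketch of where the six reading frames come from: three offsets on the given strand plus three on its reverse complement. The code only splits each frame into codons; translating them would need a codon table. The function names and the example sequence are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def six_reading_frames(dna: str) -> dict[str, list[str]]:
    """Split both strands into codons at offsets 0, 1, and 2."""
    frames = {}
    strands = (("+", dna.upper()), ("-", reverse_complement(dna)))
    for sign, strand in strands:
        for offset in range(3):
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames[f"{sign}{offset + 1}"] = codons
    return frames


for name, codons in six_reading_frames("ATGAAATAG").items():
    print(name, codons)
```

For the example sequence, frame +1 reads ATG AAA TAG (start, Lys, stop), while the other five frames read the same bases as entirely different codons.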

Qualitatively

identifying which genes are expressed and which are not in a given sample.

homology

implies a common ancestry (similarity is due to homology).

gene density

in eukaryotes is very low compared to bacteria and varies even between chromosomes in a species -large variation in introns size and number between and within genomes

DNA microarrays (DNA chips)

labelled cDNA binding to a spot is detected by fluorescence (analysis of genes simultaneously). -Microarrays are being replaced by next generation DNA sequencing of cDNA.

important note:

mass spectrometry can also be used to identify protein covalent modifications.

similarity

means sequence similarity

quantitatively

measuring varying levels of expression of genes

important note:

new technologies have brought higher speeds, greater accuracy, and reduced costs of sequencing; these are extensions of the HGP.

blastn

nucleotide vs. nucleotide databases

GenBank

one of the most important genomic databases. GenBank annotates the following: genes, their regulatory sequences, and their functions.

gene knockout

organism in which a specific gene has been inactivated. -bacterial knockouts are created using transposons -mammalian knockouts are created using homologous recombination. -plant knockouts are usually T-DNA insertions (transformed with a piece of DNA that inserts randomly into the genome).

contigs

overlapping fragments that collectively form one continuous DNA molecule within a chromosome.

sequence recognition

program identifies specific sequences

blastp

protein vs protein database

tblastn

protein vs translated nucleotide

restriction enzymes

recognize specific (usually 6 bases long) sequences and cleave the DNA at those sequences.
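
As a hedged sketch, the search-and-cut behavior can be modeled on one strand of a linear DNA molecule. EcoRI (site GAATTC, cut after the first G) is used as the example enzyme; it is not named in the notes above, and the function names are hypothetical.

```python
def find_sites(dna: str, site: str = "GAATTC") -> list[int]:
    """Return the 0-based start position of every recognition site."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear molecule `cut_offset` bases into each recognition site."""
    dna = dna.upper()
    fragments = []
    previous_cut = 0
    for position in find_sites(dna, site):
        cut = position + cut_offset
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
    fragments.append(dna[previous_cut:])
    return fragments


sequence = "AAGAATTCGGGAATTCTT"
print(find_sites(sequence))  # [2, 10]
print(digest(sequence))      # ['AAG', 'AATTCGGG', 'AATTCTT']
```

Two sites in a linear molecule give three fragments; the sketch ignores the complementary strand and the sticky ends a real digest would leave.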

paralogs

related genes within or between species (from duplication events in one genome).

physical mapping

relies on DNA cloning techniques - genes are mapped relative to each other (base pairs-measurement). - identifying overlapping clones across whole chromosome to create CONTIGS.

linkage mapping

relies on genetic crosses - genes are mapped relative to each other (map units-measurement).
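
For a concrete sense of the measurement, one map unit corresponds to a 1% recombination frequency between two genes. A minimal sketch with made-up offspring counts and a hypothetical function name:

```python
def map_distance(recombinant_offspring: int, total_offspring: int) -> float:
    """Recombination frequency expressed in map units (percent)."""
    return 100 * recombinant_offspring / total_offspring


# 120 recombinant offspring out of 1000 scored in a cross
print(map_distance(120, 1000))  # 12.0
```

This simple ratio is only reliable for genes that are fairly close together, because multiple crossovers between distant genes go undetected.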

cytogenetic mapping

relies on microscopy - genes are mapped relative to visible band locations.

copy number variants (CNVs)

segments of DNA that are duplicated or deleted.

pyrosequencing

sequencing by synthesis among others (always changing).

DNA microarrays

show the mRNA expression of thousands of genes simultaneously

polymorphic

show variation between individuals in a population.

protein microarrays

similar concept to DNA microarrays and are used to study: - protein expression - protein function - pharmacology

bioinformatics

software to analyze the huge amounts of data goal: extract information from genetic sequences with a mathematical/computational approach.

what is the purpose of the BLAST searches?

starts with a sequence and then locates homologous sequences in a large database.

open reading frames (ORFs)

stretches of nucleotides that, when translated to protein, generate a series of amino acids prior to a stop codon, suggestive of a protein-encoding gene.
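
A minimal ORF scan might look like the sketch below. It assumes an ORF starts at ATG and ends at the first in-frame stop codon, and it checks only the three frames of the given strand; the function name and the `min_codons` cutoff are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each ATG-to-stop stretch found."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                # keep the ORF only if enough codons precede the stop
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return orfs


print(find_orfs("CCATGGCTAAATGACC"))
# [(2, 14, 'ATGGCTAAATGA')]
```

Real gene finders use a much larger minimum length, since short ATG-to-stop stretches occur by chance in any sequence.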

Transcriptome analysis (global analysis of gene expression)

studies expression of genes qualitatively and quantitatively.

functional genomics

study of gene function based on the RNAs or possible proteins they encode as well as regulatory elements. - more typically now RNAseq is used to generate transcriptomes. - uses Nextgen sequencing to sequence cDNA made from specific tissue samples.

proteomics

study of the proteome (proteins produced) and how they interact with each other.

eukaryotic genomes

the basic features are similar in different species, genome size in eukaryotes is highly variable, but the number of actual genes is fairly consistent.

genome

the complete set of DNA in a single cell of an organism

functional genomics

the goal of functional genomics is to elucidate the roles of genetic sequences in a given species.

After denaturation...

the proteins are separated by molecular weight and identified by mass spectrometry.

genomics

the study of genomes

similarity score (identity value)

the sum of identical matches/total number of bases or amino acids aligned.
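
The ratio can be computed directly from two aligned sequences. This sketch assumes the sequences are already aligned to equal length with "-" marking gaps, and that gap columns count toward the total; alignment programs differ on that last point, and the function name is hypothetical.

```python
def identity_value(aligned_a: str, aligned_b: str) -> float:
    """Identical matches divided by the number of aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return matches / len(aligned_a)


print(identity_value("ACGTACGT", "ACGTTCGT"))  # 7 of 8 positions -> 0.875
print(round(identity_value("ACG-TA", "ACGTTA"), 3))  # 5 of 6 positions -> 0.833
```

The same function works for amino acid alignments, since it only compares characters position by position.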

if proteins are extracted under native conditions

they are separated by their isoelectric point (no net charge)

what is the aim of gene annotation?

to identify and label important structural features of genes (known or unknown). Such as: - regulatory elements in promoters and enhancers - exons and introns (splice sites). - translation start and stop sites -polyadenylation site

blastx

translated nucleotide vs protein database

tblastx

translated nucleotide vs translated nucleotide

what is used to separate proteins?

two-dimensional gel electrophoresis

chromatin immunoprecipitation (ChIP) assays

used to determine if proteins can bind to a particular region of DNA.

genome/transcriptome annotation

using BLAST searches and motif databases to assign putative functions to all expressed genes (BLAST2GO).

most widely used method:

whole genome shotgun sequencing (made practical by decreasing costs and increasing computer power).

