Genome annotation

differences in annotation

- Illustrates differences in annotation between two strings
- The differences arise because no annotation pipeline (a software tool that consists of individual components) was used
- Different annotators annotate the same genes differently and give them different names
- A huge problem that leads to a lot of confusion

Bacterial genome annotation?

- Bacterial genomes vary in size from 1 Mb to 8 Mb
- It is not practical to annotate them by hand
- Use an automatic pipeline to transfer information from a closely related reference genome
- Then (try to) manually curate the results
- Manual curation catches and removes errors

What is genome annotation?

- After a genome is sequenced, it is just a string of bases
- We need to assign meaning to the genome, that is, annotate it
- Structural annotation: identify genes and their intron-exon structures
- Functional annotation: attach data that says what the gene does and what function it is involved in

steps of virus genome annotation

- Break the genome into 1000 bp fragments
- BLASTX the fragments against GenBank
- Find out which proteins these regions encode
- Find the hypothetical proteins by using a gene-finding program
- Hypothetical protein: a protein whose function we don't know (the reason we want to annotate it)
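
The first step above, breaking the genome into 1000 bp fragments, can be sketched in Python as follows. This is only a minimal illustration: the function names, the `genome` record name, and the toy sequence are assumptions made for the example, and the BLASTX search itself is not shown.

```python
def fragment_genome(sequence, size=1000):
    """Split a sequence into consecutive, non-overlapping fragments of at most `size` bases."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def fragments_to_fasta(fragments, name="genome", size=1000):
    """Format fragments as FASTA records whose titles record 1-based coordinates."""
    records = []
    for index, fragment in enumerate(fragments):
        start = index * size + 1
        end = start + len(fragment) - 1
        records.append(f">{name}_{start}-{end}\n{fragment}")
    return "\n".join(records)


# Toy 3500-base sequence standing in for a real virus genome (hypothetical data).
toy_genome = "ACGT" * 875
fragments = fragment_genome(toy_genome)
print([len(f) for f in fragments])  # [1000, 1000, 1000, 500]
```

Each fragment written by `fragments_to_fasta` could then be submitted as a separate query, with the coordinates in the title showing where a hit lies in the genome.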

virus genome annotation:

- Virus genomes are not large
- Therefore, we can annotate them manually
- For an adenovirus genome of 35 kb, the process can take 2-4 weeks
- The process consists of several steps

software tools: Ab initio and evidence-drivable gene predictors

GeneMark - a self-training gene finder
GenomeScan - an extension of the popular Genscan algorithm

Eukaryotic genome annotation (how to predict)

Ab initio gene prediction
• Gene predictors became available in the 1990s and revolutionized genome analyses
• They need no external evidence to identify a gene or to determine its intron-exon structure
• Most do not report untranslated regions or alternatively spliced transcripts

software tools: EST, protein and RNA-seq aligners and assemblers

BLAST - Basic Local Alignment Search Tool
BLAT - faster than BLAST but has fewer features

Eukaryotic genome annotation

• The ultimate goal is to synthesize alignment-based evidence with ab initio predictions into a final gene annotation set
• Human curation is too time-consuming and too expensive
• Run different gene finders on the genome and choose the best prediction

What are the steps of bacterial genome annotation?

- FASTA sequence: a text-based format for representing either nucleotide or peptide sequences, using single-letter codes
- Predict genes: the process of identifying genes
- Compare to reference (genome/UniProt): compare each gene to your closely related reference genome using BLAST
- Is there a homologue? Determine whether a closely related protein is present
- If a homologue is present: take the reference annotation
- If no homologue is present: label the gene as a hypothetical protein and predict domains by running it through Pfam and PROSITE; the domains give some clue as to the function
- Add annotation: assign meaning to the genome
- Predict other features (e.g. tRNA)
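
The homologue decision in these steps can be sketched as a small Python function. This is a hedged illustration, not a real pipeline: the input structures, the `min_identity` cutoff of 50%, and the toy genes and domain names are all assumptions introduced for the example, and the BLAST and domain searches are assumed to have been run already.

```python
def annotate_gene(gene_id, best_hit, domains, min_identity=50.0):
    """Decide how to annotate one predicted gene.

    best_hit: None, or a dict with 'product' and 'identity' (percent) for the
              best reference match (hypothetical structure for this sketch).
    domains:  list of predicted domain names, used only when no homologue is found.
    min_identity: assumed cutoff for calling a hit a homologue.
    """
    if best_hit is not None and best_hit["identity"] >= min_identity:
        # Homologue present: take the reference annotation.
        return {"gene": gene_id, "product": best_hit["product"],
                "evidence": "reference homologue"}
    # No homologue: label as hypothetical and keep any domain clues.
    if domains:
        evidence = "predicted domains: " + ", ".join(domains)
    else:
        evidence = "no homologue or domains found"
    return {"gene": gene_id, "product": "hypothetical protein", "evidence": evidence}


# Toy inputs (hypothetical data, not real search results).
toy_genes = [
    ("gene_1", {"product": "DNA gyrase subunit A", "identity": 91.5}, []),
    ("gene_2", None, ["toy_domain_A"]),
    ("gene_3", {"product": "toy reference protein", "identity": 22.0}, []),
]
annotations = [annotate_gene(*gene) for gene in toy_genes]
for a in annotations:
    print(a["gene"], a["product"], a["evidence"], sep="\t")
```

Only `gene_1` inherits the reference annotation; the other two are labelled hypothetical, with `gene_2` keeping its domain hint as a clue to function.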

software tools: choosers and combiners

JIGSAW
GLEAN

software tools: genome annotation pipelines, and why do we want to use a genome browser?

PASA
NCBI
- They can do a lot of things
- We use a genome browser to find the list of genes in a given genomic region

Artemis:

- A genome browsing tool and also an annotation tool
- Works with GenBank files
- A FASTA sequence is pure sequence plus a title line that begins with > (the greater-than symbol)
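
The FASTA layout described above (a title line starting with `>` followed by pure sequence) can be read with a few lines of Python. This is a minimal sketch; the function name and the toy records are assumptions for illustration, and real projects would more likely use an established parser.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA title (without '>') to its sequence."""
    records = {}
    title = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            title = line[1:]
            records[title] = []
        elif title is not None:
            # Sequence may be wrapped over several lines; collect the pieces.
            records[title].append(line)
    return {t: "".join(parts) for t, parts in records.items()}


# Toy FASTA text (hypothetical records).
toy_fasta = """>seq1 toy nucleotide record
ACGTAC
GTTA
>seq2 toy peptide record
MKV
"""
sequences = parse_fasta(toy_fasta)
print(sequences)
```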

Eukaryotic genome annotation

• Use annotation pipelines, since these genomes are large
• Mask genome repeats: replace repeats with Ns. Failure to mask repeats will lead to millions of spurious BLAST hits, which will provide false evidence for genome annotation
• Next, use evidence-based annotation
• Evidence-based annotation aligns ESTs (which give evidence of gene expression) and proteins to the genome using BLAST
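
Masking, as described above, simply means overwriting repeat regions with Ns. The sketch below assumes the repeat coordinates have already been found by a separate repeat-finding tool; the function name, the 0-based half-open `(start, end)` interval convention, and the toy sequence are assumptions for illustration.

```python
def mask_repeats(sequence, repeats):
    """Replace each repeat interval with Ns.

    repeats: list of (start, end) pairs, 0-based and end-exclusive
             (an assumed convention for this sketch).
    """
    bases = list(sequence)
    for start, end in repeats:
        # Clamp intervals so out-of-range coordinates cannot raise errors.
        start = max(start, 0)
        end = min(end, len(bases))
        for i in range(start, end):
            bases[i] = "N"
    return "".join(bases)


# Toy sequence and repeat coordinates (hypothetical data).
genome = "ACGTACGTACGTACGT"
masked = mask_repeats(genome, [(4, 8), (12, 14)])
print(masked)  # ACGTNNNNACGTNNGT
```

Because the masked sequence keeps its original length, any alignment coordinates reported afterwards still refer to the same genome positions.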

