
Benchmarking of read mapping bias in allele specific expression analysis

Alva Rani James
Göteborg: Chalmers University of Technology, 2013. 38 pp.
[Master's thesis]

Most genes in diploid organisms have two “copies”, one inherited from each parent. If an individual has two different alleles (sequence variants) at a specific gene locus, the individual is heterozygous at that locus. Allele specific expression (ASE) is the differential expression of the two alleles of a gene within a single individual. Several mechanisms can cause ASE; for example, a heterozygous variant in the promoter region can cause a difference in transcription factor binding affinity between the maternal and paternal alleles. ASE can be measured and identified accurately only if the reads generated by RNA next generation sequencing (RNA-seq) are mapped precisely to the reference genome of the organism. Mapping bias, a major technical hurdle in ASE studies, arises when short RNA-seq reads are mapped to a reference genome, mainly because reads that carry non-reference alleles match the reference less well and therefore receive a lower mapping quality than reads that carry the reference allele. In this thesis we investigated two proposed methods for reducing mapping bias: the read mapping program GSNAP, and masking of the reference genome at single nucleotide variant positions. Masking the reference genome removed more of the mapping bias than GSNAP did; however, masking caused a considerable drop in read coverage. In conclusion, neither of the two methods reduced the mapping bias satisfactorily, which highlights the need to develop new or modified methods for mapping bias reduction.
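The two ideas in the abstract, masking the reference at variant positions and checking for residual bias toward the reference allele, can be illustrated with a minimal Python sketch. It is not the pipeline used in the thesis: the function names `mask_reference` and `reference_allele_fraction`, the 0-based coordinates, and the toy sequence and read counts are all assumptions made for illustration. The underlying expectation is that, at a heterozygous site with no ASE and no mapping bias, roughly half of the reads should carry the reference allele.

```python
from typing import Iterable


def mask_reference(sequence: str, snv_positions: Iterable[int], mask_char: str = "N") -> str:
    """Return a copy of `sequence` with each 0-based SNV position replaced by `mask_char`.

    Hypothetical helper: masking makes reads with either allele mismatch
    the reference equally at that position.
    """
    bases = list(sequence)
    for pos in snv_positions:
        if not 0 <= pos < len(bases):
            raise ValueError(f"SNV position {pos} is outside the sequence")
        bases[pos] = mask_char
    return "".join(bases)


def reference_allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a heterozygous site that carry the reference allele.

    Hypothetical helper: values consistently above 0.5 across many sites
    are one possible sign of mapping bias toward the reference.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


# Toy data (assumed, not from the thesis).
print(mask_reference("ACGTACGT", [2, 5]))   # ACNTANGT
print(reference_allele_fraction(60, 40))    # 0.6
```

In practice the fraction would be examined across many heterozygous sites rather than at one, since a deviation at a single site may reflect genuine ASE rather than mapping bias.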

The publication was registered on 2013-06-24. It was last modified on 2013-06-24.

CPL ID: 179084
