Computational insights into alternative splicing driven proteome diversification: An evolutionary perspective
Publisher
IISER Mohali
Abstract
Alternative Splicing (AS) generates transcriptome diversity in eukaryotes through variably spliced mRNA transcripts.
Their translation contributes to proteome diversity and expands the functional repertoire of genes. The expansion of
the proteome through AS could help decipher the perplexing observation of variable organismal complexity in eukaryotes
despite having similar gene counts. While significant experimental and computational efforts have enhanced our
understanding of transcriptome diversity, there have been limited studies on the contribution of AS to proteome
expansion. The splicing information documented in databases is not amenable to systematic investigation of the
impact of AS on protein sequence and function variation, or to its comparison across eukaryotes. In my thesis,
I have devised an innovative approach to uniquely annotate exons that facilitates comparative analyses of proteome
variation generated by various AS events (exon skipping, mutually exclusive exons, alternate splice sites, and intron
retention).
I developed a standardized framework, Exon Nomenclature and Classification of Transcripts (ENACT), to
identify and classify exons. The ENACT framework annotates exons for their coding property, occurrence, and splice
site variation attributes, followed by mapping of protein sequence along with predicted features. We documented exon
annotations of five representative genomes, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus
musculus, and Homo sapiens, in the ENACTdb database. Using ENACT, we identified an emerging trend of ‘dual’
exons, which are non-coding in some transcripts and coding in others. Both dual and untranslated region (UTR) exons
are prevalent in higher eukaryotes, with a significant increase in their alternate sub-type. Notably, the ratio of
alternate to constitutive exons is higher for UTR exons than for coding exons, indicating an increased extent of Alternate
Transcription (AT) in conjunction with AS. We performed detailed analyses of AT/AS processes in the human genome
and examined the distribution of protein domains in constitutive and alternate exons. Using the first and last
constitutive exons to define the core and AT regions on the protein sequence, I observed that the core region has a larger
domain fraction per gene and varies least among isoforms. In contrast, the AT region constitutes a third of the coding
protein, yet it contains more intact domains than the core region, suggesting a significant role in introducing variability
of protein domains and a likely contribution to organismal complexity.
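The exon attributes described above (coding property and occurrence) can be illustrated with a minimal sketch. This is not the ENACT implementation: the input layout, the function names `classify_exons` and `alternate_to_constitutive_ratio`, and the toy gene are assumptions made for illustration, and splice site variation is left out entirely. An exon is labelled 'dual' when it is coding in some transcripts and non-coding in others, and 'constitutive' when it occurs in every transcript of the gene.

```python
from collections import Counter

# Hypothetical toy gene: each transcript maps exon IDs to True (coding in
# that transcript) or False (non-coding in that transcript).
transcripts = {
    "tx1": {"e1": False, "e2": True, "e3": True, "e4": True},
    "tx2": {"e1": False, "e2": False, "e3": True, "e4": True},
    "tx3": {"e1": False, "e3": True, "e4": True, "e5": False},
}


def classify_exons(transcripts):
    """Label each exon with (coding property, occurrence)."""
    n_transcripts = len(transcripts)
    states = {}  # exon ID -> coding flags, one per transcript containing it
    for exons in transcripts.values():
        for exon_id, is_coding in exons.items():
            states.setdefault(exon_id, []).append(is_coding)

    labels = {}
    for exon_id, seen in states.items():
        if all(seen):
            coding_property = "coding"
        elif not any(seen):
            coding_property = "UTR"
        else:
            coding_property = "dual"  # coding in some transcripts only
        # Constitutive here means present in every transcript of the gene.
        occurrence = "constitutive" if len(seen) == n_transcripts else "alternate"
        labels[exon_id] = (coding_property, occurrence)
    return labels


def alternate_to_constitutive_ratio(labels, coding_property):
    """Ratio of alternate to constitutive exons for one coding property.

    Returns None when there are no constitutive exons of that type.
    """
    counts = Counter(
        occurrence
        for prop, occurrence in labels.values()
        if prop == coding_property
    )
    if counts["constitutive"] == 0:
        return None
    return counts["alternate"] / counts["constitutive"]


labels = classify_exons(transcripts)
utr_ratio = alternate_to_constitutive_ratio(labels, "UTR")
coding_ratio = alternate_to_constitutive_ratio(labels, "coding")
```

In this toy gene, `e2` is the only dual exon, and the UTR ratio exceeds the coding ratio. A genome-wide comparison of the kind reported above would aggregate such counts over all genes rather than a single one.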
Apart from analyzing alternative splicing, I have performed modeling and simulation studies of β-sheet nanocrystal
regions responsible for the ultimate tensile strength of spider silk protein. Through multiple steered molecular dynamics
(SMD) pull studies on modeled nanocrystals, we found that the naturally occurring silk sequence achieves superior mechanical
strength by optimizing side-chain interactions, packing, and main-chain hydrogen bond interactions. In another study, I
investigated the role of Y321 in oligomerization of the Vibrio cholerae cytolysin toxin. Molecular dynamics (MD) and network
analyses showed that the Y321A mutation leads to a drastic change in network communities, suggesting a possible loss of coordinated motion required
during oligomerization.