Short linear motifs (SLiMs) are an elegant and spatially efficient solution for encoding interaction interfaces. They mediate a diverse set of regulatory functions, such as directing ligand binding, providing both peripheral docking sites and specificity for modifying kinases, controlling protein stability by recruiting ubiquitin ligases, and acting as signals that target proteins to specific subcellular locations. A peptide motif encodes a functional interface in a compact module that is structurally and functionally autonomous and can evolve de novo in a polypeptide without compromising the structural and functional integrity of the rest of the protein. This modularity, in combination with the regulatory potential of motifs and the ease with which they arise de novo, provides a simple evolutionary mechanism for placing proteins under extensive regulation by the cellular pathways involved in protein homeostasis. Consequently, most regulatory motifs are present in multiple unrelated proteins, and peptide motifs influence the life of almost all proteins from synthesis to destruction, regulating and coordinating their processing, localization and degradation.
Approximately one third of the human proteome consists of intrinsically disordered regions (IDRs). However, the functional role of the vast majority of these regions is unknown. The most common functional modules within IDRs are the compact, linear protein interaction sites known as short linear motifs (SLiMs). The human proteome has been estimated to contain more than a hundred thousand, and possibly up to a million, SLiM instances. This would make them the most numerous protein modules in the cell. Yet, to date, only a few thousand instances are known. The goal of this objective is to develop an in silico framework for the discovery of novel SLiM instances. We develop two SLiM discovery tools, SLiMSearch and SLiMPrints. SLiMSearch screens whole proteomes for novel instances of known motifs, incorporating ancillary evolutionary, proteomic and genomic data to rank putative motif instances by confidence of functionality. SLiMPrints discovers novel motifs by searching for groups of residues that are under greater functional constraint than their surrounding residues, a strong functional discriminator for motifs. Although this objective builds on many years of development, a number of areas remain where the individual components can be improved. These areas include: (i) the underlying statistics of motif occurrence, (ii) the functional annotation of motifs through keyword enrichment, (iii) the incorporation of regulatory information into motif discovery tools as contextual information to prioritise hits for further validation, and (iv) the construction of high-quality alignments for motif conservation analyses.
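The core operation behind a proteome-wide screen for instances of a known motif can be illustrated with a regular expression search. The following is a minimal sketch, not the SLiMSearch implementation: the function name `scan_proteome`, the `MotifHit` record, the toy sequences and the simplified proline-rich pattern `P..P` are all assumptions made for illustration, and no ranking by evolutionary, proteomic or genomic data is attempted.

```python
import re
from typing import Dict, List, NamedTuple


class MotifHit(NamedTuple):
    protein: str
    start: int    # 1-based position of the first motif residue
    end: int      # 1-based position of the last motif residue
    peptide: str  # matched residues


def scan_proteome(pattern: str, proteome: Dict[str, str]) -> List[MotifHit]:
    """Return every match of a motif regex in each sequence, including overlaps."""
    # A zero-width lookahead wrapping a capture group reports overlapping
    # matches, which a plain re.finditer(pattern, ...) would skip.
    regex = re.compile(f"(?=({pattern}))")
    hits = []
    for name, seq in proteome.items():
        for match in regex.finditer(seq):
            peptide = match.group(1)
            start = match.start() + 1
            hits.append(MotifHit(name, start, start + len(peptide) - 1, peptide))
    return hits


# Hypothetical toy sequences, not real proteins.
toy_proteome = {
    "protA": "MSPAPPPPSK",
    "protB": "MKTAYIAKQR",
}

# Illustrative pattern only; curated motif definitions are usually more specific.
hits = scan_proteome("P..P", toy_proteome)
for hit in hits:
    print(hit)
```

In this toy input, protA yields two overlapping matches and protB yields none. A real screen would then attach the contextual evidence described above to each hit before ranking.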
Common discriminatory attributes of SLiM functionality, including conservation (SLiMPrints), accessibility (the IUPred tool and structural information from PDB) and propensity to fold upon binding (the ANCHOR tool), can be calculated to enrich the results of mass spectrometry, phage display or in silico screens for functional motifs. Experimental, proteomic and genomic data can also be integrated, including known motifs (the ELM resource, switches.ELM and SLiMDB), modification sites (the PhosphoSite and Phospho.ELM resources), isoform-specific expression of the containing exon (Ensembl), single nucleotide polymorphisms (dbSNP), structural information (PDB) and mutagenesis information (UniProt). This information can be used to flag unusual hits, check the fidelity of the results and select the best hits for further experimentation. Finally, GO term and keyword analysis tools can predict a potential functional role for the discovered domain-binding peptides, and the SLiMFinder tool can identify any consensus in the returned motif peptides.
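One way such attributes might be combined to prioritise candidate hits is sketched below. It assumes that per-residue disorder and conservation scores in the range 0 to 1 have already been computed by external tools; the function names, the toy scores and the 0.5 disorder cutoff are illustrative assumptions. The sketch ranks by the mean conservation inside the motif window, which is a simplification: it does not compare the motif with its flanking residues as a relative conservation measure would.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def window_mean(scores: Sequence[float], start: int, end: int) -> float:
    """Mean of per-residue scores over a 1-based, inclusive window."""
    return mean(scores[start - 1:end])


def prioritise_hits(
    hits: Sequence[Tuple[int, int]],
    disorder: Sequence[float],
    conservation: Sequence[float],
    min_disorder: float = 0.5,  # assumed cutoff for "accessible" regions
) -> List[Tuple[int, int, float, float]]:
    """Keep hits in disordered regions and rank them by mean conservation."""
    kept = []
    for start, end in hits:
        d = window_mean(disorder, start, end)
        if d >= min_disorder:
            kept.append((start, end, d, window_mean(conservation, start, end)))
    # Highest mean conservation first.
    return sorted(kept, key=lambda hit: hit[3], reverse=True)


# Hypothetical per-residue scores for a 10-residue protein.
disorder = [0.2, 0.2, 0.3, 0.6, 0.7, 0.8, 0.8, 0.7, 0.6, 0.6]
conservation = [0.9, 0.9, 0.8, 0.3, 0.4, 0.9, 0.9, 0.8, 0.2, 0.2]
candidate_hits = [(1, 3), (4, 6), (6, 8)]

ranked = prioritise_hits(candidate_hits, disorder, conservation)
for start, end, d, c in ranked:
    print(f"{start}-{end}: disorder={d:.2f} conservation={c:.2f}")
```

Here the hit at positions 1 to 3 is discarded because it lies in a region scored as ordered, and the remaining two hits are ordered by conservation. Other evidence, such as modification sites or polymorphisms, could be added as further columns in the same way.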
The intrinsic attributes of SLiMs predispose them to both pre- and post-translational regulation. The relatively weak affinity of SLiM-containing interfaces can be easily modulated by post-translational modification (PTM) and, consequently, modification of a residue in or adjacent to a linear motif is a common mechanism for conditionally and dynamically regulating SLiM-mediated interactions. The compact footprint of SLiMs facilitates the occurrence of regions of high functional density that contain multiple adjacent or overlapping motifs. These multi-module interfaces can act cooperatively or competitively in a highly controlled manner to conditionally build functionally distinct complexes, depending on the local abundance of the binding partners. Finally, the inclusion or exclusion of a SLiM-containing exon by pre-translational mechanisms such as alternative splicing, alternative promoter usage and RNA editing can rewire the interaction network of a protein isoform in a cell state- or tissue-specific manner, thereby altering its subcellular localization, half-life, binding partners, activity or modification state, and hence its function. To date, a substantial number of conditional SLiM-mediated interactions have been biochemically characterized, highlighting the central role that pre- and post-translational modulation of SLiMs plays in the regulation of dynamic cellular processes.
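As a small illustration of how modification sites in or adjacent to a motif might be flagged, the sketch below collects annotated PTM positions that fall inside a motif or within a short flank on either side. The function name, the example positions and the flank width of two residues are hypothetical choices for illustration; what counts as "adjacent" in practice depends on the motif and the binding domain.

```python
from typing import Iterable, List


def modifications_near_motif(
    motif_start: int,
    motif_end: int,
    ptm_positions: Iterable[int],
    flank: int = 2,  # assumed number of flanking residues treated as adjacent
) -> List[int]:
    """Return PTM positions inside the motif or within `flank` residues of it.

    All positions are 1-based residue numbers in the same protein.
    """
    lower = max(1, motif_start - flank)
    upper = motif_end + flank
    return sorted(p for p in ptm_positions if lower <= p <= upper)


# Hypothetical motif at residues 10-14 and hypothetical annotated PTM sites.
nearby = modifications_near_motif(10, 14, [3, 8, 12, 16, 20])
print(nearby)  # [8, 12, 16]
```

A motif with one or more nearby sites would be a candidate for conditional regulation and could be prioritised for experimental follow-up.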