Interest in single amino acid repeats has grown ever since they were shown to cause a variety of diseases. Subsequently, such repeats have been implicated in a number of roles, including in the facilitation of adaptive processes and in protein interaction networks.
In generic surveys, leucine repeats, which can be toxic, were surprisingly among the most frequent. We show that repeats of leucine – but not of other hydrophobic amino acids – are over-represented in signal peptides, which are cleaved off the mature protein. This trend is most pronounced in higher eukaryotes, particularly in mammals. In the human proteome, although fewer than one-fifth of all proteins have a signal peptide, approximately two-thirds of all leucine repeats are located in these transient regions. The substantial fraction of proteins affected by this strong enrichment also highlights the bias that such repeats can introduce into systematic analyses of protein sequences.
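The kind of count behind such a proportion can be illustrated with a minimal sketch. It is not the procedure used in this study: the minimum run length (`min_len=4`), the rule that a repeat counts only if it lies entirely within the signal peptide, and the toy sequences with their signal peptide lengths are all assumptions made for illustration. In practice the signal peptide length would come from an annotation or a prediction tool.

```python
import re


def find_repeats(seq, residue="L", min_len=4):
    """Return (start, end) spans of runs of `residue` with length >= min_len.

    Spans are 0-based with an exclusive end. `min_len` is an assumed
    threshold, not one taken from the text.
    """
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [m.span() for m in pattern.finditer(seq)]


def fraction_in_signal_peptide(proteins, residue="L", min_len=4):
    """Fraction of repeats that lie entirely within a signal peptide.

    `proteins` is an iterable of (sequence, signal_peptide_length) pairs,
    where the length is None for proteins without a signal peptide.
    """
    total = 0
    in_signal_peptide = 0
    for seq, sp_len in proteins:
        for _start, end in find_repeats(seq, residue, min_len):
            total += 1
            # Assumed rule: the whole run must end before the cleavage site.
            if sp_len is not None and end <= sp_len:
                in_signal_peptide += 1
    return in_signal_peptide / total if total else 0.0


# Hypothetical toy sequences, invented purely for illustration.
toy_proteins = [
    ("MKLLLLLAVALGSAQAEDTK", 15),    # one leucine run, inside the signal peptide
    ("MSTLLLLKDE", None),            # one leucine run, no signal peptide
    ("MAAALLLLLGAWAQDLLLLE", 12),    # one run inside, one run outside
]

toy_fraction = fraction_in_signal_peptide(toy_proteins)
```

Comparing this fraction with the share of proteins that carry a signal peptide at all gives a rough sense of the enrichment described above.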
Moreover, in contrast to the general lack of conservation among repeats, these leucine repeats were found to be more conserved than the remaining signal peptide regions, indicating that they may have an as-yet-unknown functional role.