No idea if this is the right forum for this question, but I'll give it a shot.
I am considering studying the mouse gene 1700007K13Rik further. The resulting protein should have the following sequence:
Highlighted in bold is a region containing a lot of E and P, and there seem to be some repeats as well. Does anybody have an idea what such a domain might do, based on the sequence alone? I have absolutely no idea what it means for a protein to have a lot of prolines and glutamic acids in a small region.
I searched for similar known domains but found nothing.
Hi,

Proline is often called a helix breaker (it has poor helix-forming ability) because of its pyrrolidine ring. The ring constrains the backbone, and the backbone nitrogen has no hydrogen to donate to a helix hydrogen bond, so proline cannot easily adopt the conformation a helix requires. In most cases, proline is found at the end of an α-helix. Many proline-rich sequences contain XP or XPY motifs, which tend to occur where the chain twists and turns, usually on the protein surface, although proline itself is hydrophobic and aliphatic. This exposure makes them easy for other proteins to recognize, for example in intracellular signalling.

In your protein, the residue next to proline is either glutamic acid (E) or lysine (K). An α-helix has an overall dipole moment, caused by the aggregate effect of the individual dipoles of the peptide-bond carbonyl groups pointing along the helix axis. This can lead to destabilization of the helix through entropic effects. As a result, α-helices are often capped at the N-terminal end by a negatively charged amino acid, such as glutamic acid, to neutralize the helix dipole. Less common (and less effective) is C-terminal capping with a positively charged amino acid, such as lysine.

Hope you find it useful.
You could also search for "PXXP" or "poly-proline domain and p53", which might give you some hints.
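If it helps, here is a minimal Python sketch for checking a sequence for PXXP motifs and for E/P-rich stretches. The function names `find_pxxp` and `ep_fraction`, the default window of 20 residues, and the toy sequence are all my own assumptions for illustration; the toy string is made up and is not the 1700007K13Rik protein.

```python
import re


def find_pxxp(seq):
    """Return (start, motif) pairs (0-based) for every PXXP match.

    A lookahead is used so that overlapping motifs are all reported,
    e.g. "PPPPP" contains two PXXP matches.
    """
    return [(m.start(), m.group(1))
            for m in re.finditer(r"(?=(P..P))", seq.upper())]


def ep_fraction(seq, window=20):
    """Return the fraction of E or P residues in each sliding window.

    The list is empty if the sequence is shorter than the window.
    """
    seq = seq.upper()
    return [
        sum(res in "EP" for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    toy = "MKTAPEPEPKPEPEPAPLLG"
    print(find_pxxp(toy))
    print(max(ep_fraction(toy, window=10)))
```

Paste your real sequence in place of the toy string; windows with a high E/P fraction should line up with the region you highlighted, and any PXXP hits are candidates to look at first.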