Internal structure similarity in proteins can be observed at the domain and subdomain levels. From an evolutionary perspective, structurally similar elements may arise divergently by gene duplication and fusion events but may also be the product of convergent evolution under physicochemical constraints. The characterization of proteins that contain repeated structural elements has implications for many fields of protein science including protein domain evolution, structure classification, structure prediction, and protein engineering. FiRES (Find Repeated Elements in Structure) is an algorithm that relies on a topology-independent structure alignment method to identify repeating elements in protein structure. FiRES was tested against two hand curated databases of protein repeats: MALIDUP, for very divergent duplicated domains; and RepeatsDB for short tandem repeats. The performance of FiRES was compared to that of lalign, RADAR, HHrepID, CE-symm, ReUPred, and Swelfe. FiRES was the method that most accurately detected proteins either with duplicated domains (accuracy = 0.86) or with multiple repeated units (accuracy = 0.92). FiRES is a new methodology for the discovery of proteins containing structurally similar elements. The FiRES web server is publicly available at http://fires.ifc.unam.mx. The scripts, results, and benchmarks from this study can be downloaded from https://github.com/Claualvarez/fires.
Última actualización: 14/09/2020