The patterns of interest are leucine-aspartic acid motifs. These are short amino acid sequences, found within some proteins, that connect those proteins to the cellular molecules that control cell adhesion, motility, and survival.
LD motifs and their role in disease
The proteins are involved in embryogenesis (the process by which the embryo develops from the fertilised egg cell), wound healing and the evolution of multicellularity (as characterized by advanced lifeforms).
However, leucine-aspartic acid motifs are also known to play a role in the spread of cancer cells and in cardiovascular and infectious diseases. Hence, a greater understanding of the patterns could lead to new medical advancements.
Knowledge about leucine-aspartic acid motifs began to develop with their discovery in 1996, although only four such proteins have been discovered to date, according to Laboratory Manager magazine. Accurate prediction of the motifs has proven difficult because they are short and their sequences are degenerate.
Machine learning and new developments
The new understanding of these protein motifs comes from King Abdullah University of Science & Technology, where researchers applied machine learning. The resulting tool has been named the LD Motif Finder (LDMF).
Machine learning is concerned with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data.
The algorithm scans through the human proteome and identifies leucine-aspartic acid motif patterns. The initial design proved to be highly complex, given the very small number of known patterns available to train the platform.
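As a rough illustration of what scanning protein sequences for a motif involves, the Python sketch below searches a set of sequences for the commonly cited LD-motif consensus LDXLLXXL (where X is any residue). This is a deliberate simplification and not how LDMF works: the tool uses machine learning precisely because a fixed pattern cannot capture degenerate motifs. The function names and the toy sequences are hypothetical, chosen only for the example.

```python
import re

# Commonly cited LD-motif consensus LDXLLXXL, where X is any residue.
# The lookahead lets overlapping matches be reported as well.
LD_CONSENSUS = re.compile(r"(?=(LD[A-Z]LL[A-Z]{2}L))")


def scan_sequence(sequence):
    """Return (start, matched_segment) pairs, with 0-based start positions."""
    return [(m.start(), m.group(1)) for m in LD_CONSENSUS.finditer(sequence.upper())]


def scan_proteome(proteome):
    """Scan a mapping of protein name -> sequence; keep only proteins with hits."""
    hits = {}
    for name, seq in proteome.items():
        found = scan_sequence(seq)
        if found:
            hits[name] = found
    return hits


# Hypothetical toy sequences for illustration, not real proteins.
toy_proteome = {
    "toy_protein_A": "MSTELDALLADLESTT",
    "toy_protein_B": "MKKAAGGSSPPQQ",
}

print(scan_proteome(toy_proteome))
# {'toy_protein_A': [(4, 'LDALLADL')]}
```

A strict pattern like this would miss motifs that deviate from the consensus and would also flag segments that are not functional, which is one way to see why prediction is hard.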
The accuracy of the algorithm was enhanced by including experimental testing of earlier predictions and teaching the system to learn from the results.
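The idea of feeding experimental results back into a model can be sketched in a few lines of Python. The example below is an assumption-laden toy, not the method used in LDMF: it builds a simple per-position scoring profile from a few example motifs, then retrains after a candidate is (hypothetically) confirmed in the laboratory. All sequences, names and parameters are invented for illustration.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_profile(motifs, pseudocount=1.0):
    """Build per-position log-odds scores from equal-length example motifs."""
    length = len(motifs[0])
    background = 1.0 / len(AMINO_ACIDS)  # uniform background, an assumption
    total = len(motifs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(length):
        counts = Counter(m[pos] for m in motifs)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, segment):
    """Sum the per-position scores of a segment (standard residues only)."""
    return sum(column[aa] for column, aa in zip(profile, segment))


# Hypothetical training motifs and candidate, invented for this sketch.
known_motifs = ["LDELLAEL", "LDALLADL", "LDRLLLEL"]
candidate = "LDSLLKEL"

profile = train_profile(known_motifs)
before = score(profile, candidate)

# Suppose a laboratory test confirmed the candidate: add it and retrain.
profile = train_profile(known_motifs + [candidate])
after = score(profile, candidate)

print(after > before)  # True: the confirmed motif now scores higher
```

With so few examples, each confirmed or rejected prediction shifts the model noticeably, which suggests why experimental feedback is valuable when training data are scarce.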
By applying machine learning, the research group succeeded in identifying twelve new human proteins that carry functional leucine-aspartic acid motifs.
Commenting on this, lead researcher Stefan Arold states: “This gives us a good idea of how many of these motifs exist within the human proteome. It seems there are far fewer than researchers initially suggested. Of course, this does not mean that they are biologically irrelevant.”
The key finding was that proteins containing leucine-aspartic acid motifs possess functions connected to cell adhesion and morphogenesis. This implies that leucine-aspartic acid motifs significantly define the cellular roles of proteins.
The machine learning tool was also applied to the genomes of mammals, birds, fish, worms, insects and microorganisms, in order to track down leucine-aspartic acid motifs. The broader analysis enabled the researchers to conclude that leucine-aspartic acid motif signaling evolved some 800 million years ago in unicellular organisms.
Other applications of machine learning
Machine learning is being used to advance biological research in other ways. For example, one research group at MIT used an algorithm to predict how human cells respond to breaks in DNA. This led to the finding that cells generally repair broken genes in ways that are precise and predictable. This finding could help to treat some forms of rare genetic diseases.
In medicine, the application of machine learning extends from preliminary (early-stage) drug discovery, including the initial screening of drug compounds, to predicting success rates based on various biological factors.
Such examples suggest that the big paradigm shift of machine learning will, in time, be adopted in all areas of biology and medicine.
Research paper
The research has been published in the journal Bioinformatics, with the paper titled “Proteome-level assessment of origin, prevalence and function of leucine-aspartic acid (LD) motifs.”
Essential Science
This article is part of Digital Journal’s regular Essential Science columns. Each week Tim Sandle explores a topical and important scientific issue.
Last week the subject was exoplanets. This followed news that NASA has reported it has detected an Earth-like planet that has all the indications of being habitable. This forms part of the space agency’s attempt to seek out new planets of interest in the cosmos.
The week before, we looked at why earlier attempts to develop a vaccine against the bacterium Staphylococcus aureus have failed. The research indicates that a new approach to vaccine design is required, one in which an untapped set of immune cells needs to be activated.