
Essential Science: Developing ML to see protein patterns

By Tim Sandle     Jan 20, 2020 in Science
A new machine learning system has been used to characterize 800-million-year-old amino acid patterns that had, up until now, puzzled scientists. These patterns are of great importance because they facilitate protein interactions.
The patterns of interest are leucine-aspartic acid motifs. These are short amino acid sequences found within some proteins, and they connect those proteins to the cellular molecules that control cell adhesion, motility, and survival.
LD motifs and their role in disease
The proteins are involved with embryogenesis (the process by which the embryo develops from the fertilised egg cell), wound healing and the evolution of multicellularity (as characterized by advanced lifeforms).
The cell research team nurture billions of cells to keep each one uncontaminated and thriving.
However, leucine-aspartic acid motifs are also known to play a role in the spread of cancer cells, as well as in cardiovascular and infectious diseases. Hence, a greater understanding of the patterns could lead to new medical advancements.
Knowledge about leucine-aspartic acid motifs began with their discovery in 1996, although only four such proteins have been discovered to date, according to Laboratory Manager magazine. Accurately predicting the motifs has proven difficult because they are short and their sequences are degenerate.
Brain inflammation is present in brains of autistic patients. — Neurons often have extensive networks of dendrites, which receive synaptic connections. Shown is a pyramidal neuron from the hippocampus, stained for green fluorescent protein.
Wei-Chung Allen Lee, Hayden Huang, Guoping Feng, Joshua R. Sanes, Emery N. Brown, Peter T. So, Elly
Machine learning and new developments
The new understanding about the protein structures comes from King Abdullah University of Science & Technology and it involved the application of machine learning. The developed tool has been named the LD Motif Finder (LDMF).
Machine learning is concerned with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data.
The algorithm scans through the human proteome and identifies leucine-aspartic acid motif patterns. The initial design proved to be highly complex, given how few known patterns were available to train the platform.
The accuracy of the algorithm was enhanced by including experimental testing of earlier predictions and teaching the system to learn from the results.
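To give a rough sense of what scanning a proteome for a short motif involves, here is a minimal sketch in Python. It is not the LDMF tool, which relies on a trained machine learning model. Instead it assumes a simplified consensus pattern for the motif (L-D-x-L-L-x-x-L, where x is any amino acid) and uses made-up toy sequences; the names `LD_PATTERN`, `find_ld_candidates` and `toy_proteome` are hypothetical.

```python
import re

# Simplified consensus for an LD motif: L-D-x-L-L-x-x-L.
# Assumption for illustration only; the real tool uses a trained model
# precisely because a fixed pattern is too crude for degenerate motifs.
# The lookahead lets overlapping candidates be reported as well.
LD_PATTERN = re.compile(r"(?=(LD.LL..L))")


def find_ld_candidates(proteome):
    """Return (protein_id, 1-based start position, matched sequence) tuples."""
    hits = []
    for protein_id, sequence in proteome.items():
        for match in LD_PATTERN.finditer(sequence):
            hits.append((protein_id, match.start() + 1, match.group(1)))
    return hits


# Toy sequences invented for this example, not real proteins.
toy_proteome = {
    "toy_protein_A": "MSTNLDALLADLESTT",
    "toy_protein_B": "MKKPEEGGSSAAVVTT",
}

for hit in find_ld_candidates(toy_proteome):
    print(hit)
```

A fixed pattern like this would flag many sequences that merely resemble the motif and miss variants that deviate from the consensus, which hints at why the researchers turned to machine learning and experimental validation.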
By applying machine learning, the science group were successful in identifying twelve new human proteins that carry functional leucine-aspartic acid motifs.
Commenting on this, lead researcher Stefan Arold states: "This gives us a good idea of how many of these motifs exist within the human proteome. It seems there are far fewer than researchers initially suggested. Of course, this does not mean that they are biologically irrelevant."
The key finding was that proteins containing leucine-aspartic acid motifs possess functions connected to cell adhesion and morphogenesis. This implies that leucine-aspartic acid motifs significantly define the cellular roles of proteins.
The machine learning tool was also applied to examine the genomes of mammals, birds, fish, worms, insects and microorganisms, in order to track down leucine-aspartic acid motifs. The broader analysis enabled the researchers to conclude that leucine-aspartic acid motif signaling evolved some 800 million years ago in unicellular organisms.
Other applications of machine learning
Machine learning is being used to advance biological research in other ways. For example, one research group at MIT used an algorithm to predict how human cells respond to breaks in DNA. This led to the finding that cells generally repair broken genes in ways that are precise and predictable. This finding could help to treat some forms of rare genetic diseases.
Stem cells are primitive cells that, as they grow, become differentiated into the various specialised cells that make up the brain, the heart, kidneys and other organs.
Mauricio Lima, AFP/File
In medicine, the application of machine learning now extends from preliminary (early-stage) drug discovery to the initial screening of drug compounds, and to predicting success rates based on various biological factors.
Such examples suggest that the big paradigm shift of machine learning will, in time, be adopted in all areas of biology and medicine.
Research paper
The research has been published in the journal Bioinformatics, with the paper titled “Proteome-level assessment of origin, prevalence and function of leucine-aspartic acid (LD) motifs.”
Essential Science
This article is part of Digital Journal's regular Essential Science columns. Each week Tim Sandle explores a topical and important scientific issue.
An ESA/Hubble artist's impression of the K2-18b super-Earth, the only super-Earth exoplanet known to host both water and temperatures that could support life
Last week the subject was exoplanets. This followed news that NASA has reported it has detected an Earth-like planet that has all the indications of being habitable. This forms part of the space agency’s attempt to seek out new planets of interest in the cosmos.
The week before, we looked at why earlier attempts to develop a vaccine against the bacterium Staphylococcus aureus have failed. The research indicates that a new approach to vaccine design is required, one in which an untapped set of immune cells needs to be activated.