Artificial intelligence (A.I.) is the ability of computer systems to perform complex tasks that would usually require human intelligence. A.I. was once the stuff of science fiction, but recent technological advances mean that many people now interact with A.I. on a daily basis, for instance when using phone personal assistants or home smart assistants. As well as making our everyday lives easier, the ability of A.I. to handle huge and complex sets of data is now also changing the scientific landscape.
This blog highlights how A.I. technology is being used to shape scientific discoveries and research direction, and how Bio-Rad's antibody search engine harnesses these technologies.
In recent years, there has been an explosion in the amount of data that can be generated in experiments, thanks to high-throughput analysis and automation. Previously, it would often take several grants (and many post-doc hours) to analyze a genetic screen or imaging data and interpret subtle changes in genetic pathways.
Modern techniques have made the whole process much faster. Consequently, far more data is being created, and it isn’t uncommon for data to be produced faster than researchers can analyze it.
This is where A.I. and machine learning (the ability to learn from data without being explicitly programmed) can help with data automation and analysis to accelerate scientific discoveries.
Modern genomics can generate deep datasets with a wealth of information (such as metabolomic, genomic, and proteomic data), but these can prove difficult for traditional statistical techniques to handle. A.I. and machine learning can help solve this problem by evaluating hundreds of heterogeneous datasets and gene-gene networks, spotting subtle patterns, and learning to distinguish ‘signal’ from ‘noise’. Researchers are left with a more manageable amount of data to sift through, and with more time to verify findings and shape their hypotheses.
Deep learning algorithms mimic the neural connections and learning behavior of the human brain, and can be used to examine raw data and develop predictive models to analyze the data, leading to identification of interactions of interest. In recent years, a number of papers have been published where academic researchers have demonstrated the feasibility and reliability of machine learning to predict important biological interactions.
This has been used, for example, in the study of molecular disease mechanisms to help scientists decide where to focus their research (Greene and Troyanskaya 2012).
Machine learning can evaluate and analyze large numbers of datasets and drive hypotheses forward (LeCun et al. 2015). A number of recent publications have demonstrated the capability of deep learning to accurately predict protein-protein interactions (Sun et al. 2017) and how genetic determinants of disease affect RNA splicing (Xiong et al. 2015).
Machine learning is also advancing image analysis thanks to its objectivity and its ability to learn to spot and exclude artifacts. Automated image analysis enables more rapid screening and is particularly useful for analyzing data from high-throughput phenotypic screens. Complex cellular phenotypes can be identified through the use of fluorescent labels of cellular components such as organelles.
Steve Finkbeiner (UCSF) has used deep learning to identify, with high accuracy, dead neurons in a neuronal culture that contains both live and dead cells (Christiansen et al. 2018). What makes this unique is the ability of deep learning to do this on unlabeled neuronal cultures, which would not be feasible for a human.
A.I. is also driving research forward in other ways. Bio-Rad’s antibody search engine “A.I.den” uses cutting-edge A.I. and machine learning technology to ensure that only useful and relevant results are returned. Researchers no longer need to spend time scrolling through multiple pages of search results to find the resources or products they were looking for. Instead, A.I.den uses machine learning technology to return the most relevant results based on how other similar users have searched.
For example, because A.I.den has been developed to search for antibodies, when you look for “CD4”, it recognizes that “CD44” and “CD45” are different targets. This is unlike other search engines, which group all of these results together based on partial matches containing “CD4”. The results returned are also dynamic, so the most likely matches for CD4 are automatically presented first. This means that users can quickly find the antibody they want.
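To make the difference between partial matching and whole-term matching concrete, here is a minimal Python sketch. It is not how A.I.den is implemented; the catalog entries and function names are hypothetical, and a simple word-boundary regular expression stands in for whatever the real search engine does.

```python
import re

# Hypothetical catalog entries, for illustration only.
CATALOG = [
    "Mouse anti-human CD4",
    "Mouse anti-human CD44",
    "Rat anti-mouse CD45",
    "Rat anti-mouse CD4",
]

def partial_match_search(query, catalog):
    """Naive search: return every entry that contains the query as a substring."""
    return [entry for entry in catalog if query.lower() in entry.lower()]

def whole_term_search(query, catalog):
    """Return only entries where the query appears as a complete term."""
    # \b marks a word boundary, so "CD4" does not match inside "CD44" or "CD45".
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    return [entry for entry in catalog if pattern.search(entry)]

print(partial_match_search("CD4", CATALOG))  # all four entries
print(whole_term_search("CD4", CATALOG))     # only the two CD4 entries
```

The substring search returns CD44 and CD45 products alongside CD4, whereas the whole-term search treats each marker name as a distinct target.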
The smart machine learning technology also corrects common spelling mistakes and typos, like “western botting”. This means that rather than returning no results, it predicts what someone is likely to be looking for, based on previous behavior from similar users. In addition to offering application-based filters, the technology can handle complicated search requests such as “cd4 antibody for flow cytometry”, showing only the antibodies and resources that meet these requirements. This means that in many cases, one search request can find exactly what is needed.
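The two behaviors described above, typo correction and understanding a combined request, can be roughly illustrated in Python. This is only a sketch under stated assumptions: it uses plain string similarity from the standard library's difflib module instead of a model trained on user behavior, and the application list, filler words, and function names are all hypothetical.

```python
import difflib

# Hypothetical vocabulary of application names, for illustration only.
APPLICATIONS = ["western blotting", "flow cytometry", "immunohistochemistry", "elisa"]

# Hypothetical filler words that carry no search meaning in this sketch.
FILLER_WORDS = {"for", "antibody", "antibodies"}

def correct_term(term, vocabulary=APPLICATIONS, cutoff=0.8):
    """Return the closest known term, or the input unchanged if none is close."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else term

def parse_query(query):
    """Split a query into target keywords and a recognized application, if any."""
    text = query.lower()
    for app in APPLICATIONS:
        if app in text:
            # Remove the application name, then drop filler words.
            remaining = text.replace(app, " ").split()
            keywords = [word for word in remaining if word not in FILLER_WORDS]
            return {"keywords": keywords, "application": app}
    return {"keywords": text.split(), "application": None}

print(correct_term("western botting"))
print(parse_query("cd4 antibody for flow cytometry"))
```

Here the misspelled “western botting” is mapped to the nearest known application, and the combined request is separated into a target keyword and an application that could then be used as a filter.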
Christiansen EM et al. (2018). In Silico Labeling: Predicting Fluorescent Labels in Unlabeled Images. Cell, 173, 792-803.e19.
Greene CS and Troyanskaya OG (2012). Chapter 2: Data-Driven View of Disease Biology. PLoS Computational Biology, 8, e1002816.
LeCun Y et al. (2015). Deep learning. Nature, 521, 436-444.
Sun T et al. (2017). Sequence-based prediction of protein protein interaction using a deep-learning algorithm. BMC Bioinformatics, 18, 277.
Xiong HY et al. (2015). The human splicing code reveals new insights into the genetic determinants of disease. Science, 347, 1254806.