Smith, Tony C. & Cleary, John G. (2003) Automatically linking MEDLINE abstracts to the Gene Ontology. Proceedings of the ISMB 2003 BioLINK Text Data Mining SIG, Brisbane, Australia. June.
Permanent Research Commons link: http://hdl.handle.net/10289/1723
Much has been written recently about the need for effective tools and methods for mining the wealth of information present in biomedical literature (Mack and Hehenberger, 2002; Blagosklonny and Pardee, 2001; Rindflesch et al., 2002), an activity known as conceptual biology. Keyword search engines operating over large electronic document stores (such as PubMed and PNAS) offer some help, but fundamental obstacles limit their effectiveness. First, there is no general consensus among scientists about the vernacular to be used when describing research about genes, proteins, drugs, diseases, tissues and therapies, making it very difficult to formulate a search query that retrieves the right documents. Second, finding relevant articles is just one aspect of the investigative process. A more fundamental goal is to establish links and relationships between facts existing in published literature in order to “validate current hypotheses or to generate new ones” (Barnes and Robertson, 2002), something keyword search engines do little to support.