Semantic Web Applications and Tools for Life Sciences (SWAT4LS) awards 2015 best paper prize
A paper titled "Wikidata: A platform for data integration and dissemination for the life sciences and beyond" won the 2015 SWAT4LS best paper prize. Dr. Elvira Mitraka, a postdoctoral fellow at the University of Maryland, presented the winning paper on behalf of her co-authors: Dr. Lynn M. Schriml (University of Maryland), Andra Waagmeester (Micelio), Dr. Sebastian Burgstaller-Muehlbacher (The Scripps Research Institute [TSRI]), Dr. Benjamin M. Good (TSRI), and Dr. Andrew I. Su (TSRI).
"Wikipedia is among the most visited sites on the internet," she noted. "Articles about medical topics were viewed more than 4.88 billion times in 2013, a number on par with nih.gov and significantly greater than WebMD."
Although Wikipedia consists primarily of unstructured text, a growing number of its articles are becoming tightly linked, through infoboxes, with machine-readable structured data from Wikidata. Wikidata's closest predecessor, DBpedia, served as a global linking hub for the Semantic Web, and Wikidata is poised to surpass it thanks to several advantages. Unlike DBpedia, which derived its content by parsing infoboxes in Wikipedia, Wikidata can be edited directly, so changes are visible in real time. Because Wikidata is a database to begin with, no parsing is needed. It also holds considerably more information than DBpedia, since much of its content does not appear in Wikipedia.
"When we were first creating stub articles of human genes in Wikipedia, we were advised to limit our import activities to human genes of interest. Hence, we loaded less than 10,000 genes in Wikipedia—less than half the total number of human genes known at the time," remarked Dr. Andrew Su, an Associate Professor in charge of the Gene Wiki project.
Dr. Sebastian Burgstaller-Muehlbacher, a postdoctoral research associate also on Dr. Su's team, reported, "So far, we've added about 56,451 human and 73,086 mouse genes from NCBI Gene into Wikidata. We've also loaded 6,562 Disease Ontology concepts and 1,830 FDA-approved drugs using our bots."
The authors propose that wider use of Wikidata will help realize the potential of the Semantic Web and allow automatic, serendipitous, cross-continental integration of information that is naturally highly distributed.
Andra Waagmeester, a data scientist at Micelio, provided the following biomedical example of a useful query already made possible thanks to Wikidata. "Using Wikidata's SPARQL endpoint, we are now able to determine clinically relevant drug-drug interactions known for e.g. methadone. The data that was needed to answer this query came from different groups working completely independently. Our team added or enhanced the drug items in Wikidata, while another team added the drug-drug interactions—and this happened without any direct coordination between our groups!"
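A query of the kind Waagmeester describes might look roughly like the sketch below, which only builds the SPARQL text and a request URL and does not contact any server. The endpoint address, the property ID P769 (assumed here to be Wikidata's "significant drug interaction" property), the lookup of the drug by its English label, and the helper names are all assumptions made for illustration; none of them come from the paper or from the authors' own code.

```python
from urllib.parse import urlencode

# Assumptions for illustration only (not taken from the article):
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
INTERACTION_PROPERTY = "P769"  # assumed ID for "significant drug interaction"


def build_interaction_query(drug_label: str, language: str = "en") -> str:
    """Build a SPARQL query listing items linked to a drug by the interaction property."""
    # Escape backslashes and double quotes so the label is a valid SPARQL string literal.
    safe_label = drug_label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?drug ?interactsWith ?interactsWithLabel WHERE {\n"
        f'  ?drug rdfs:label "{safe_label}"@{language} .\n'
        f"  ?drug wdt:{INTERACTION_PROPERTY} ?interactsWith .\n"
        "  ?interactsWith rdfs:label ?interactsWithLabel .\n"
        f'  FILTER(LANG(?interactsWithLabel) = "{language}")\n'
        "}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET URL that asks the endpoint for JSON results."""
    return WIKIDATA_SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_interaction_query("methadone")
url = build_request_url(query)
print(query)
print(url)
```

Matching on the label keeps the sketch free of hard-coded item identifiers; a production query would more likely reference the drug's Wikidata item directly, which is both faster and unambiguous.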
"There are numerous biological and medical entities that still need to be added into Wikidata in order to create a powerful, freely-accessible, and highly integrated resource for biomedical knowledge," Dr. Benjamin Good, an Assistant Professor at TSRI largely responsible for spearheading the efforts from Dr. Su's lab concluded, "We hope you will join our effort."
Proceedings from SWAT4LS are now available at: http://ceur-ws.org/Vol-1546/
More information:
Mitraka E, Waagmeester A, Burgstaller-Muehlbacher S, Schriml LM, Su AI, and Good BM. Wikidata: A platform for data integration and dissemination for the life sciences and beyond. Semantic Web Applications and Tools for Life Sciences International Conference 2015. Cambridge, England. doi: http://dx.doi.org/10.1101/031971
Provided by Su lab at The Scripps Research Institute