The MITRE Corporation
Getting off the gold standard?
The concept of a “gold standard” was initially developed as a monetary standard. More recently, it has come to mean a reliable method of assessing quality: it enables efficient, automated comparison of systems against each other and against human performance. This approach has driven great progress in many fields, including human language processing, biology, and medicine. However, reliance on a gold standard also imposes significant limitations. There are well-known issues, including its static nature (it is not suitable for evaluating interactive systems) and the cost of recruiting experts and achieving sufficient inter-annotator reliability. But underlying these issues are more fundamental questions: Does a gold standard represent expert performance on a task – and if so, on what (real) task, and who are the experts? What does a score against a gold standard represent, and how should it be interpreted? If a gold standard is based on human expert judgments, what should be done when the experts disagree? This talk will explore these issues and suggest some alternative approaches, including the use of crowdsourcing, silver standards, and interactive learning systems.
U.S. National Library of Medicine
Extracting information about drugs from the literature and drug labels
In this talk, I will present three collections generated to expedite the development of new approaches to extracting information about drugs from the literature and drug labels. We annotated about 700 excerpts from full-text articles with complete drug prescription information and established a baseline approach to the extraction of drug names, forms, and doses, as well as the frequency, duration, routes, and reasons for drug administration.
In collaboration with the Food and Drug Administration (FDA), we have developed two collections of Structured Product Labels (SPLs), which provide official descriptions of drugs. SPLs in the first dataset were annotated with Adverse Drug Reactions, and in the second dataset with Drug-Drug Interactions. I will present the results of two Text Analysis Conference (TAC) challenges enabled by these datasets and conclude with the remaining challenges in drug information extraction.
7 years BioASQ: Lessons learnt and the road ahead
The BioASQ challenge on large-scale biomedical semantic indexing and question answering is running for the 7th time in 2019. In these 7 years, a large number of academic and industrial research groups from around the world have joined the BioASQ community. Through their participation in the challenge, they have pushed the state of the art in semantic indexing and question answering to new levels. A notable example of the influence of BioASQ is the effect it has had on the way biomedical articles are indexed by the National Library of Medicine. Another significant contribution of BioASQ is the curation of a corpus of biomedical questions by a team of experts, who have also provided ground-truth answers and supporting material, such as related documents. The BioASQ corpus will exceed 3000 questions this year.
In this talk, we will look back at the progress made during the last 7 years in the various tasks of the BioASQ challenge. We will highlight the main trends in technology and their effect on the performance of the participating systems. We will also revisit the main decisions that were made and their effect on the running of the challenge. Finally, we will present our short- and mid-term plans for the future, aiming to receive feedback and initiate a discussion on collaboration with related community efforts.
University of Grenoble
CLEF eHealth: an evaluation challenge to improve access to health information for patients, their next of kin and clinical staff
In today’s information-overloaded society, it is increasingly difficult to retrieve and digest valid, relevant information to make health-centered decisions. Medical content is becoming available electronically in a variety of forms, ranging from patient records and medical dossiers to scientific publications, health-related websites, and medical topics shared across social networks. Laypeople, clinicians and policy-makers need to easily retrieve and make sense of medical content to support their decision making. Information retrieval systems have been commonly used as a means to access health information available online. However, the reliability, quality, and suitability of the information for the target audience vary greatly, while high recall or coverage, that is, finding all relevant information about a topic, is often as important as high precision, if not more so. Furthermore, information seekers in the health domain also experience difficulties in expressing their information needs as search queries.
CLEF eHealth aims to bring together researchers working on related information access topics and to provide them with datasets to work with and validate their outcomes against. In this talk I will present the CLEF eHealth evaluation lab, which organizes tasks to evaluate information extraction and information retrieval on medical and biomedical data.
LIMSI, CNRS, Université Paris Saclay
Clinical Natural Language Processing in languages other than English
Natural Language Processing applied to clinical text, or aimed at a clinical outcome, has been thriving in recent years. The ability to analyze clinical text in languages other than English opens access to important medical data concerning cohorts of patients who are treated in countries where English is not the official language. In this presentation, I will provide an overview of clinical NLP in languages other than English, including shared tasks. While some studies focus on a particular language and clinical application, others take a more global approach and try to offer multilingual solutions. I will highlight specific studies that illustrate the particular challenges of dealing with one or more languages other than English. Finally, I will conclude with desiderata for future work engaging the community on multilingual clinical NLP.