Topic: Choose No More: Combining Keyword Search and Supervised Learning


Speaker: Mr. Eugene Yang


Date: 6/17 (Wed.) 3:30-5:00PM


Location: Google Hangouts Meet

                  URL is


Eugene Yang is a computer science Ph.D. candidate at Georgetown University, advised by Ophir Frieder, Jeremy Fineman, and David D. Lewis. He received a bachelor's degree in quantitative finance from National Tsing Hua University. His research focuses on total recall retrieval and technology-assisted review, especially in legal applications. His interests include Bayesian supervised learning, sequential decision problems, and the explainability of machine learning models.



Traditionally, supervised classification and information retrieval are considered distinct problems with differing inputs. While classification requires a set of annotated data points, retrieval models demand only a textual query to rank the documents. Classification models, in contrast, once trained, sustain greater accuracy and efficiency at separating the wheat from the chaff. The obvious question is: given both forms of information (textual keywords and labeled documents), can we utilize both? Ignoring either loses information; combining them is believed to be complicated. In this talk, within the domain of legal information processing, we develop an integration framework that combines both information types into a single model. The resulting approach capitalizes on the advantages of each information type, achieving a resource-efficient and accurate system. Ethical issues of machine learning within legal applications are likewise addressed.
