Xiang Ren - USC Computer Science

Research: machine learning and data-driven methods for knowledge acquisition from text, e.g., turning massive text data into machine-actionable structures.

xiangren@usc.edu (Prospective Students)

Curriculum Vitae   (last updated May 1, 2017)
Research Statement  |  Teaching Statement

GitHub  |  Google Scholar  |  LinkedIn

I’m joining the faculty of USC Computer Science as an Assistant Professor in Spring 2018, and am actively recruiting new students.

I'm an incoming Assistant Professor in the Department of Computer Science at USC, affiliated with the USC Machine Learning Center and USC ISI. Currently, I'm a visiting researcher at Stanford University working with Dan Jurafsky and Jure Leskovec. Before that, I was a Ph.D. student in CS@UIUC, where I worked with Jiawei Han. I'm interested in computational methods and systems that extract structured, machine-actionable knowledge from massive text data. My thesis research received a Google PhD Fellowship, a Yahoo!-DAIS Research Excellence Award, and a David J. Kuck Outstanding M.S. Thesis Award.


- I'm co-organizing the 1st Workshop on Knowledge Base Construction, Reasoning and Mining (KBCOM'18), co-located with WSDM'18 on Feb 9, 2018. Invited speakers include Luna Dong, Oren Etzioni, Lise Getoor, Alon Halevy, Monica Lam, Chris Ré, Xifeng Yan and Luke Zettlemoyer. Stay tuned; the Call for Papers is coming soon!

- Automated knowledge base construction with indirect supervision: In many information extraction tasks, direct supervision in the form of manually-annotated text sequences is expensive to obtain, but various kinds of indirect supervision (e.g., KB facts, hand-crafted rules, user feedback) are much easier to collect at a large scale. Our SIGMOD 2017 and WWW 2017 tutorials summarize recent studies on effectively leveraging distant supervision and multi-tasking different extraction tasks.

- [NEW] Both human experts and public knowledge bases can provide (indirect, weak) supervision for information extraction (e.g., hand-crafted rules, distant supervision). Such indirect supervision trades off label quality against the amount of labeled data one can obtain. How can we leverage these heterogeneous supervision sources (over the same set of instances) in a principled way? We formulate a joint objective that unifies representation learning and truth finding. Code and data are on GitHub.

- Indirect supervision may result in noisily- and partially-labeled data. This is especially challenging when dealing with a complex label space (e.g., a label hierarchy). We propose hierarchical partial-label embedding to overcome these issues.
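To make the truth-finding idea above concrete, here is a minimal, illustrative sketch (not the actual joint objective from the paper, which also learns representations): several hypothetical weak-supervision sources label overlapping instances, and we alternate between taking a reliability-weighted vote per instance and re-estimating each source's reliability from its agreement with the current consensus. All source and instance names here are made up for illustration.

```python
# Illustrative truth finding over heterogeneous weak supervision (hypothetical
# sources/instances; a toy stand-in for the paper's joint objective).
from collections import defaultdict

def truth_finding(labels, n_iters=10):
    """labels: dict mapping source name -> {instance: label}.
    Sources may conflict; returns (inferred truths, source weights)."""
    sources = list(labels)
    weights = {s: 1.0 for s in sources}  # start with uniform trust
    truths = {}
    for _ in range(n_iters):
        # Step 1: reliability-weighted vote for each instance's label.
        votes = defaultdict(lambda: defaultdict(float))
        for s in sources:
            for inst, lab in labels[s].items():
                votes[inst][lab] += weights[s]
        truths = {inst: max(v, key=v.get) for inst, v in votes.items()}
        # Step 2: a source's weight is its agreement rate with the consensus.
        for s in sources:
            agree = sum(truths[i] == l for i, l in labels[s].items())
            weights[s] = agree / max(len(labels[s]), 1)
    return truths, weights
```

For example, with two rule-based sources and one distant-supervision source that disagrees on one instance, the consensus follows the majority and the disagreeing source ends up with a lower weight.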

News

Aug 2017 - Talking about entity and relation typing in KDD'17 tutorial on Mining Entity-Relation-Attribute Structures from Massive Text Data.
Aug 2017 - Talking about text-rich recommendation models in KDD'17 tutorial on Context-Rich Recommendation: Integrating Links, Text, and Spatio-Temporal Dimensions.
Aug 2017 - One paper on multi-view network embedding is accepted to CIKM'17.
Jul 2017 - New work on Heterogeneous Supervision has been accepted to EMNLP'17!
May 2017 - Two research papers are accepted to KDD 2017!
Mar 2017 - I received an ACM Student Scholarship to attend ACM's Celebration of 50 Years of the Turing Award.
Mar 2017 - I received a WWW 2017 Outstanding Reviewer Award.