My research focuses on accelerating the development of new corpora for semantic role labeling (SRL).
I'm investigating techniques for conducting active learning for semantic role labeling: How can we determine which sentences will most improve the model when annotated and added to our training data? This methodology enables us to improve annotation efficiency by selecting only the most informative sentences to annotate.
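To make the selection question concrete, here is a minimal sketch of uncertainty sampling, one common active learning strategy. It is an illustration only and not necessarily the selection criterion I end up using. It assumes a hypothetical SRL model that outputs a probability distribution over role labels for every token; the function names and the mean-entropy score are choices made for this example.

```python
import math
from typing import Dict, List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one token's role-label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def sentence_uncertainty(token_probs: Sequence[Sequence[float]]) -> float:
    """Mean token entropy; higher means the model is less sure about the sentence."""
    if not token_probs:
        return 0.0
    return sum(token_entropy(p) for p in token_probs) / len(token_probs)


def select_for_annotation(
    pool: Dict[str, Sequence[Sequence[float]]], k: int
) -> List[str]:
    """Return the ids of the k unlabeled sentences the model is least certain about.

    `pool` maps a sentence id to the per-token label distributions that the
    (hypothetical) model predicted for that sentence.
    """
    ranked = sorted(pool, key=lambda sid: sentence_uncertainty(pool[sid]), reverse=True)
    return ranked[:k]


# Toy pool with made-up probabilities over three role labels.
pool = {
    "confident": [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],
    "uncertain": [[0.34, 0.33, 0.33], [0.40, 0.30, 0.30]],
    "mixed": [[0.90, 0.05, 0.05], [0.40, 0.30, 0.30]],
}
chosen = select_for_annotation(pool, k=2)
```

Here the two sentences with the flattest predicted distributions are sent to annotators first, while the sentence the model already labels confidently is left in the pool.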
Additionally, I'm examining approaches for projecting semantic annotation cross-lingually: If we know what the semantic roles are in an English sentence, and we know the translation of that sentence, can we figure out which words to assign those roles to in the target language? These projected annotations may serve either as a starting point for manual annotation that will expedite the process, or as training data themselves.
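As a rough sketch of what projection involves, the example below copies role labels from English tokens to the target-language tokens they are aligned with. It assumes the word alignment is already available as (source index, target index) pairs from some external aligner; the function name, the label strings, and the first-label-wins rule are illustrative assumptions rather than a description of my actual pipeline.

```python
from typing import Dict, List, Optional, Tuple


def project_roles(
    source_roles: Dict[int, str],
    alignment: List[Tuple[int, int]],
    target_length: int,
) -> List[Optional[str]]:
    """Copy role labels across word alignments (direct projection).

    source_roles: source token index -> role label.
    alignment: (source index, target index) pairs; assumed to be given.
    Target tokens with no labeled, aligned source token stay None.
    """
    projected: List[Optional[str]] = [None] * target_length
    for src_idx, tgt_idx in alignment:
        role = source_roles.get(src_idx)
        # Keep the first label that reaches a target token (a simplifying choice).
        if role is not None and projected[tgt_idx] is None:
            projected[tgt_idx] = role
    return projected


# English: "Mary read the book"  ->  Russian: "Мария прочитала книгу"
english = ["Mary", "read", "the", "book"]
russian = ["Мария", "прочитала", "книгу"]
source_roles = {0: "ARG0", 1: "PRED", 2: "ARG1", 3: "ARG1"}
alignment = [(0, 0), (1, 1), (3, 2)]  # "the" has no Russian counterpart

projected = project_roles(source_roles, alignment, len(russian))
```

In this toy case every Russian token receives a label, but with noisy alignments or divergent translations some tokens will end up unlabeled or mislabeled, which is why projected annotations may need manual correction before use.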
I'm presently exploring these techniques in the context of developing and expanding a Russian PropBank corpus.