Language Resources and Intelligence Lecture Series (Lectures 1, 2, and 3)
December 2, 2016

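Lecture 1: Corpus-Based Machine Translation

Speaker: Chengqing Zong, Research Professor and Doctoral Supervisor, Institute of Automation, Chinese Academy of Sciences

Time: 14:30, Wednesday, December 7, 2016

Venue: Room 707, Comprehensive Building, Beijing Language and Culture University (BLCU)

Abstract: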

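Corpus-based machine translation has made considerable progress in recent years, but its performance depends heavily on the size, quality, and domain relevance of the corpus. Three questions have therefore become bottlenecks in current machine translation research: how to acquire large-scale, high-quality bilingual parallel corpora quickly via computer networks, automatically or semi-automatically; how to annotate bilingual parallel corpora for corpus-based machine translation systems; and how to let translation engines use manually annotated corpora to improve translation quality. This talk will present a framework for acquiring bilingual parallel resources and a method for automatically extracting parallel sentence pairs and fragments. It will then briefly analyze open problems in machine translation and outline preliminary ideas for future research.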

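About the speaker:

Chengqing Zong is a research professor and doctoral supervisor at the Institute of Automation, Chinese Academy of Sciences. His research focuses on natural language processing, machine translation, and text content analysis. He has led more than 10 national projects, published more than 150 academic papers, and published one monograph and one translated book. He is a permanent member of the International Committee on Computational Linguistics (ICCL), Secretary General of the Asian Federation of Natural Language Processing (AFNLP), an executive council member of the Chinese Information Processing Society of China and vice chair of its Machine Translation Technical Committee, and a council member of the Chinese Association for Artificial Intelligence. He serves as an associate editor of ACM TALLIP and of Acta Automatica Sinica, and on the editorial boards of IEEE Intelligent Systems, Machine Translation, and JCST. He has served as program committee chair, organizing committee chair, and area chair for several leading international conferences (ACL, COLING, IJCAI, AAAI, and others). His honors include the first prize of the Qian Weichang Chinese Information Processing Science and Technology Award, the first prize of the Chinese Institute of Electronics Science and Technology Progress Award, the second prize of the National Science and Technology Progress Award, and the special government allowance granted by the State Council.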

Lecture 2: Towards a Universal Meaning Representation for Natural Language Processing

Speaker: Dr. Nianwen Xue, Associate Professor, Brandeis University

Time: 14:30-16:30, Thursday, December 8, 2016

Venue: Room 707, Comprehensive Building, Beijing Language and Culture University (BLCU)

Abstract:

The speaker will start by introducing two semantic annotation projects: the predicate-argument structure annotation of the Proposition Bank (PropBank) and of the Chinese Proposition Bank (CPB). He will then discuss the Abstract Meaning Representation (AMR), a specification language that is largely based on the predicate-argument structure annotation in the Proposition Bank. AMR is a plausible candidate on which a universal meaning representation could be built.

About the speaker:

Dr. Nianwen Xue is an Associate Professor in the Computer Science Department and the Language & Linguistics Program at Brandeis University, Editor-in-Chief of the ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), and Vice Chair/Chair-elect of SIGHAN, an ACL special interest group on Chinese language processing. He was previously a research assistant professor in the Department of Linguistics and the Center for Computational Language and Education Research (CLEAR) at the University of Colorado at Boulder. He carried out his postdoctoral research in the Institute for Research in Cognitive Science and the Department of Computer and Information Science at the University of Pennsylvania.

Dr. Xue has broad research interests in computational linguistics and natural language processing. He has devoted substantial effort to developing linguistic corpora annotated with syntactic, semantic, temporal, and discourse information, which are crucial resources for natural language processing, and has led the development of the Chinese TreeBank, the Chinese Proposition Bank, and the Chinese Discourse TreeBank. He has published over 100 papers on Chinese word segmentation, syntactic and semantic parsing, coreference, discourse analysis, machine translation, and biomedical natural language processing.

Lecture 3: Towards a Universal Meaning Representation for Natural Language Processing

Speaker: Dr. Nianwen Xue, Associate Professor, Brandeis University

Time: 14:30-16:30, Friday, December 9, 2016

Venue: Room 707, Comprehensive Building, Beijing Language and Culture University (BLCU)

Abstract:

It is important to take into account both scalability and expressiveness when designing a discourse annotation framework. The speaker will share his experience annotating discourse relations in Chinese text within the Penn Discourse TreeBank framework, as well as his experience running the CoNLL Shared Task on Shallow Discourse Parsing over the last two years. Based on these experiences, he argues that a dependency structure at the discourse level strikes the right balance between scalability and expressiveness. Finally, he will present his work on annotating the discourse and dialogue structure of SMS conversations.

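Organizers: National Language Resources Monitoring and Research Center for Print Media; Advanced Innovation Center for Language Resources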