
National Natural Science Foundation of China (61303082)

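Works: 4, Citations: 2, H-index: 1
Related authors: 唐红艳, 苏劲松
Related institutions: Peking University (北京大学), Xiamen University (厦门大学)
Funding: National Natural Science Foundation of China; Doctoral Program Foundation of the Ministry of Education of China; Natural Science Foundation of Fujian Province
Related fields: Automation and Computer Technology; Language and Linguistics; Natural Sciences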

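Document type

  • Chinese journal articles (3)

Field

  • Automation and Computer... (3)
  • Natural Sciences (1)

Topic

  • text recognition (1)
  • advertisement (1)
  • advertisement text (1)
  • TOPIC (1)
  • TOPICA... (1)
  • APPROA... (1)
  • ED (1)
  • LANGUA... (1)
  • PIVOT (1)
  • GRAPH (1)
  • REORDE... (1)

Institution

  • Peking University (北京大学) (1)
  • Xiamen University (厦门大学) (1)

Author

  • 苏劲松 (1)
  • 唐红艳 (1)

Publication venue

  • 厦门大学学报... (1)
  • China ... (1)
  • Journa... (1)

Year

  • 2017 (1)
  • 2014 (2)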
4 records; showing 1-3
Graph-Based Recognition of Microblog Advertisement Text (基于图的微博广告文本识别) (cited by: 1)
2017
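The large volume of advertisements on microblogs interferes with the use of microblog data-analysis models. To address the problem of recognizing microblog advertisement text, a graph-based semi-supervised label propagation algorithm is used to guide the computer in automatically identifying advertisements within large amounts of unstructured microblog text. Evaluation on experimental data shows that, when few labeled samples are available, the graph-based semi-supervised label propagation algorithm outperforms the supervised support vector machine and naive Bayes algorithms.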
罗斌, 唐红艳, 秦悦, 苏劲松
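The graph-based label propagation idea named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words features, the cosine-similarity edge weights, the fixed iteration count, and the function names (`bag_of_words`, `cosine`, `propagate_labels`) are all assumptions chosen for illustration.

```python
import math


def bag_of_words(text):
    """Turn a text into a token -> count dictionary (assumed feature set)."""
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0.0) + 1.0
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propagate_labels(texts, seed_labels, iterations=50):
    """Spread ad / non-ad scores from a few labeled texts over a similarity graph.

    seed_labels maps a text index to 1.0 (advertisement) or 0.0 (not one).
    Returns one score per text; scores above 0.5 lean toward "advertisement".
    """
    vectors = [bag_of_words(t) for t in texts]
    n = len(texts)
    # Edge weights of the graph: pairwise similarity, no self-loops.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = [seed_labels.get(i, 0.5) for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in seed_labels:
                updated.append(seed_labels[i])  # labeled nodes stay clamped
                continue
            total = sum(weights[i])
            if total == 0.0:
                updated.append(scores[i])  # isolated node keeps its score
            else:
                # Weighted average of the neighbours' current scores.
                updated.append(
                    sum(w * s for w, s in zip(weights[i], scores)) / total
                )
        scores = updated
    return scores
```

With only one labeled advertisement and one labeled ordinary post as seeds, unlabeled texts that share vocabulary with the seeded advertisement drift toward a score of 1.0, which is the situation (few labeled samples) the abstract describes.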
Graph-based Lexicalized Reordering Models for Statistical Machine Translation
2014
Lexicalized reordering models are very important components of phrase-based translation systems. By examining the reordering relationships between adjacent phrases, conventional methods learn these models from the word-aligned bilingual corpus while ignoring the effect of the number of adjacent bilingual phrases. In this paper, we propose a method to take the number of adjacent phrases into account for better estimation of reordering models. Instead of just checking whether there is one phrase adjacent to a given phrase, our method first uses a compact structure named reordering graph to represent all phrase segmentations of a parallel sentence. The effect of the adjacent phrase number is then quantified in a forward-backward fashion and finally incorporated into the estimation of reordering models. Experimental results on the NIST Chinese-English and WMT French-Spanish data sets show that our approach significantly outperforms the baseline method.
SU Jinsong, LIU Yang, LIU Qun, DONG Huailin
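The forward-backward computation mentioned in the abstract above can be illustrated on a much simpler structure. The sketch below is a monolingual analogue, not the paper's reordering graph: it assumes a set of candidate phrase spans over one sentence and computes, for each span, the fraction of complete segmentations that use it. The function name `segmentation_posteriors` and the span representation are hypothetical.

```python
def segmentation_posteriors(n, phrases):
    """Fraction of full segmentations of an n-word sentence that use each span.

    phrases is a set of (start, end) spans with 0 <= start < end <= n,
    end exclusive. Nodes of the graph are word boundaries 0..n and each
    span is an edge from start to end.
    """
    # Forward pass: alpha[j] = number of ways to segment words 0..j.
    alpha = [0] * (n + 1)
    alpha[0] = 1
    for j in range(1, n + 1):
        alpha[j] = sum(alpha[s] for (s, e) in phrases if e == j)

    # Backward pass: beta[i] = number of ways to segment words i..n.
    beta = [0] * (n + 1)
    beta[n] = 1
    for i in range(n - 1, -1, -1):
        beta[i] = sum(beta[e] for (s, e) in phrases if s == i)

    total = alpha[n]  # number of complete segmentations
    if total == 0:
        return {}
    # A span (s, e) lies on alpha[s] * beta[e] of the complete paths.
    return {(s, e): alpha[s] * beta[e] / total for (s, e) in phrases}
```

The design point this shows is that all segmentations are accounted for without enumerating them: two linear passes over the graph give every span a fractional count, in place of the binary "adjacent or not" check that the abstract attributes to conventional methods.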
Topic-aware pivot language approach for statistical machine translation
2014
The pivot language approach for statistical machine translation (SMT) is a good method to break the resource bottleneck for certain language pairs. However, in the implementation of conventional approaches, pivot-side context information is far from fully utilized, resulting in erroneous estimations of translation probabilities. In this study, we propose two topic-aware pivot language approaches to use different levels of pivot-side context. The first method takes advantage of document-level context by assuming that the bridged phrase pairs should be similar in the document-level topic distributions. The second method focuses on the effect of local context. Central to this approach are the assumptions that the phrase sense can be reflected by local context in the form of probabilistic topics, and that bridged phrase pairs should be compatible in the latent sense distributions. Then, we build an interpolated model bringing the above methods together to further enhance the system performance. Experimental results on French-Spanish and French-German translations using English as the pivot language demonstrate the effectiveness of topic-based context in pivot-based SMT.
Jin-song SU, Xiao-dong SHI, Yan-zhou HUANG, Yang LIU, Qing-qiang WU, Yi-dong CHEN, Huai-lin DONG
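For context on the abstract above, the conventional context-free pivot step it contrasts against can be sketched as phrase-table triangulation, p(t|s) = Σ_p p(p|s) · p(t|p). This is only the baseline, not the topic-aware models the paper proposes; the function name `triangulate`, the nested-dictionary table format, and the toy probabilities are assumptions for illustration.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Bridge two phrase tables through a pivot language.

    src_to_pivot[s][p] holds p(p|s) and pivot_to_tgt[p][t] holds p(t|p).
    Returns src_to_tgt[s][t] = sum over pivot phrases p of p(p|s) * p(t|p).
    """
    src_to_tgt = defaultdict(lambda: defaultdict(float))
    for s, pivots in src_to_pivot.items():
        for p, prob_p_given_s in pivots.items():
            for t, prob_t_given_p in pivot_to_tgt.get(p, {}).items():
                src_to_tgt[s][t] += prob_p_given_s * prob_t_given_p
    return {s: dict(targets) for s, targets in src_to_tgt.items()}


# Toy French -> English -> Spanish tables (made-up probabilities).
french_to_english = {"banque": {"bank": 1.0}}
english_to_spanish = {"bank": {"banco": 0.7, "orilla": 0.3}}
bridged = triangulate(french_to_english, english_to_spanish)
```

In this toy case the ambiguous pivot word "bank" passes probability mass from "banque" to "orilla" (riverbank), because nothing on the pivot side says which sense is meant. That is the kind of erroneous estimate the abstract attributes to ignoring pivot-side context.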