Towards a Unified Multi-Dimensional Evaluator for Text Generation

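Evaluates the quality of generated text along multiple dimensions, such as consistency, fluency, and so on.

The authors construct the dataset themselves, with 30K pseudo-labeled samples per dimension: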

we first design specific rules for several commonly evaluated dimensions to construct pseudo data, and then combine them to train the evaluator.
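A minimal sketch of the "design rules, then combine" idea quoted above. The two perturbation rules here (swapping adjacent tokens for fluency, swapping in an unrelated sentence for coherence) and all function names are assumptions made for illustration; they are not taken from the paper's actual rule set.

```python
import random


def perturb_fluency(text, rng):
    """Hypothetical rule: swap two adjacent tokens to get a less fluent negative."""
    tokens = text.split()
    if len(tokens) < 2:
        return text
    i = rng.randrange(len(tokens) - 1)
    tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    return " ".join(tokens)


def perturb_coherence(text, distractor, rng):
    """Hypothetical rule: replace one sentence with an unrelated one."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if not sentences:
        return distractor
    i = rng.randrange(len(sentences))
    sentences[i] = distractor.strip().rstrip(".")
    return ". ".join(sentences) + "."


def build_pseudo_data(texts, distractor, seed=0):
    """Build positive/negative pairs per dimension, then merge them into one set."""
    rng = random.Random(seed)
    data = []
    for text in texts:
        # The original text is the positive example for every dimension.
        data.append({"dimension": "fluency", "text": text, "label": 1})
        data.append({"dimension": "fluency",
                     "text": perturb_fluency(text, rng), "label": 0})
        data.append({"dimension": "coherence", "text": text, "label": 1})
        data.append({"dimension": "coherence",
                     "text": perturb_coherence(text, distractor, rng), "label": 0})
    # Mixing the dimensions gives a single training set for one evaluator.
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    samples = build_pseudo_data(["The cat sat. It purred loudly."],
                                "Stocks fell sharply.")
    for row in samples:
        print(row)
```

Each record carries its dimension name, so a single model can be trained on the merged set and still be queried per dimension.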

Task formats: summarization and dialogue.

Experimental validation: the baselines compared include BLEU, METEOR, ROUGE, BERTScore, and others.

Human-annotated data: to verify that the proposed evaluator is qualified, we need to calculate its correlations with human scores on each benchmark.
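Correlation with human scores is usually reported as Pearson or Spearman coefficients. The sketch below computes both using only the standard library; the score lists in the usage example are made-up numbers for illustration, not results from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Illustrative numbers only.
    metric_scores = [0.2, 0.5, 0.4, 0.9]
    human_scores = [1, 3, 2, 5]
    print(pearson(metric_scores, human_scores))
    print(spearman(metric_scores, human_scores))
```

Spearman only looks at the ordering of the scores, which is why it is a common choice when the metric and the human scale are not expected to be linearly related.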

Train the evaluator for 1-3 epochs. This is a supervised method.

BARTSCORE: Evaluating Generated Text as Text Generation

Conditional text generation: for example, in machine translation the goal is to generate a hypothesis (h = h1, ..., hm) based on a given source text (s = s1, ..., sn).
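With this notation, treating evaluation as text generation amounts to scoring a hypothesis by how likely a sequence-to-sequence model finds it given the source: a weighted sum of token log-probabilities, sum over t of w_t * log p(h_t | h_<t, s). The sketch below shows only that arithmetic. The per-token probabilities are supplied by hand as an assumption; in practice they would come from a pretrained model, which is not loaded here, and the function name is hypothetical.

```python
import math


def weighted_log_likelihood(token_probs, weights=None):
    """Sum of w_t * log p(h_t | h_<t, s) over the hypothesis tokens.

    token_probs: assumed model probabilities of each hypothesis token.
    weights: optional per-token weights; uniform weights of 1.0 if omitted.
    """
    if weights is None:
        weights = [1.0] * len(token_probs)
    if len(weights) != len(token_probs):
        raise ValueError("weights and token_probs must have the same length")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must be in (0, 1]")
    return sum(w * math.log(p) for w, p in zip(weights, token_probs))


if __name__ == "__main__":
    # Hand-picked probabilities for two hypothetical hypotheses.
    likely = weighted_log_likelihood([0.9, 0.8, 0.95])
    unlikely = weighted_log_likelihood([0.3, 0.1, 0.2])
    print(likely, unlikely)
```

The score is always at most zero, and a hypothesis the model finds more probable gets a score closer to zero.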

require human judgments to train (i.e., supervised me

