What is the difference between your package's BLEU implementation and sacrebleu's? I get different results from the two on Chinese input, even after passing the text through sacrebleu's zh tokenizer.
I believe there are some differences between this implementation and sacrebleu's. Actually, testing with English shows the same problem:
With `evaluate`:

```python
import evaluate

predictions = ["hello there general kenobi", "foo bar foobar"]
references = [
    ["hello there general kenobi", "hello there !"],
    ["foo bar foobar"]
]
bleu = evaluate.load("bleu")
results = bleu.compute(predictions=predictions, references=references, smooth=False, max_order=4)
print(results)
```
With `sacrebleu`:

```python
from sacrebleu.metrics import BLEU

predictions = ["hello there general kenobi", "foo bar foobar"]
# Note: sacrebleu expects references transposed relative to evaluate:
# references[i] holds the i-th reference for every sentence. Recent
# sacrebleu versions accept None as a placeholder where a sentence
# has fewer references.
references = [
    ["hello there general kenobi", "foo bar foobar"],
    ["hello there !", None],
]
bleu = BLEU(smooth_method="none", max_ngram_order=4, tokenize='13a')
results = bleu.corpus_score(predictions, references)
print(results)
```
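For a neutral baseline to compare both libraries against, here is corpus-level BLEU computed by hand from the Papineni et al. formula, using plain whitespace tokenization and no smoothing. This is an illustrative sketch, not either library's exact implementation; differences from it would come from tokenization, smoothing, or reference handling.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Counter of n-grams of the given order in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(predictions, references, max_order=4):
    matches = [0] * max_order   # clipped n-gram matches per order
    totals = [0] * max_order    # candidate n-gram counts per order
    pred_len = ref_len = 0
    for pred, refs in zip(predictions, references):
        pred_tokens = pred.split()
        ref_token_lists = [r.split() for r in refs]
        pred_len += len(pred_tokens)
        # closest reference length (ties broken toward the shorter ref)
        ref_len += min((abs(len(r) - len(pred_tokens)), len(r))
                       for r in ref_token_lists)[1]
        for n in range(1, max_order + 1):
            pred_ngrams = ngrams(pred_tokens, n)
            max_ref = Counter()
            for r in ref_token_lists:
                max_ref |= ngrams(r, n)  # clip by the max count over refs
            matches[n - 1] += sum((pred_ngrams & max_ref).values())
            totals[n - 1] += max(len(pred_tokens) - n + 1, 0)
    if min(matches) == 0:
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_order
    bp = 1.0 if pred_len > ref_len else math.exp(1 - ref_len / pred_len)
    return bp * math.exp(log_prec)

predictions = ["hello there general kenobi", "foo bar foobar"]
references = [
    ["hello there general kenobi", "hello there !"],
    ["foo bar foobar"],
]
print(corpus_bleu(predictions, references))  # 1.0: every n-gram matches
```

On this toy example every candidate n-gram appears in some reference and the lengths match, so the unsmoothed score is exactly 1.0.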