Post-editing Analysis

Introduction

The post-editing analysis in Memsource Cloud extends the traditional translation memory analysis to include machine translation. For each segment, it measures the MT post-editing effort by comparing the machine translation output with the final post-edited translation (edit distance). If the linguist accepted the machine translation output without changing it at all, the segment appears as a 100% match in the analysis.

If, on the other hand, the linguist changes the MT output heavily, the match rate will be close to 0%. The scoring algorithm is identical to the one used to score translation memory fuzzy matches. The only difference is that the post-editing analysis is based on the target text. Because it needs the final translation, the post-editing analysis must be launched after the post-editing job has been completed.
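The relationship between edit distance and match rate can be illustrated with a minimal sketch. It assumes a character-level Levenshtein distance normalized by the length of the longer string; the actual Memsource scoring algorithm is not described here and may differ. The names `levenshtein` and `match_rate` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def match_rate(suggestion: str, final: str) -> float:
    """Similarity from 0 to 100 between a saved suggestion and the final target.

    Assumption: the score is 1 minus the normalized edit distance.
    """
    longest = max(len(suggestion), len(final))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(suggestion, final) / longest)
```

Under this assumption, an MT suggestion accepted unchanged scores 100, and a suggestion that shares no characters in position with a final translation of the same length scores 0.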

A sample post-editing analysis: Post-editing-analysis.png


The post-editing analysis combines two main components: the translation memory analysis and the machine translation analysis.

Machine Translation Analysis

  • When a user clicks into a segment, the machine translation output for that segment is saved. It is later used to calculate the machine translation match against the final post-edited translation.
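
The save-then-compare flow described above might be modeled as follows. This is only a sketch: the `Segment` class and its methods are hypothetical, the rule that the first saved suggestion is kept on re-opening is an assumption, and `difflib.SequenceMatcher` from the Python standard library stands in for the real scoring algorithm.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Segment:
    source: str
    mt_snapshot: Optional[str] = None   # MT output saved when the segment is opened
    final_target: Optional[str] = None  # confirmed post-edited translation

    def open(self, mt_output: str) -> None:
        # Assumption: only the first suggestion is kept, so re-opening
        # the segment later does not overwrite the saved MT output.
        if self.mt_snapshot is None:
            self.mt_snapshot = mt_output

    def confirm(self, target: str) -> None:
        self.final_target = target

    def mt_match(self) -> Optional[float]:
        # A score exists only once both the snapshot and the final text exist.
        if self.mt_snapshot is None or self.final_target is None:
            return None
        ratio = SequenceMatcher(None, self.mt_snapshot, self.final_target).ratio()
        return round(100 * ratio, 1)
```

Here `mt_match()` returns `None` until the segment has been confirmed, which mirrors the requirement that the analysis is run after post-editing is complete.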

Translation Memory Analysis

  • When a user clicks into a segment, the current translation memory hit for that segment is saved. It is later used to calculate the translation memory match against the final post-edited translation.
  • This means that if a user finds that a 100% TM match needs editing and modifies it, it will appear as a fuzzy match in the post-editing analysis (if the option Analyze TM post-editing has been selected).
  • For this reason, in-context matches are not supported in the post-editing analysis, as each match is in-context by default.
  • Because the analysis is based on the TM matches that were available to a user at the time of translation, it also works well when multiple translators work in the same project, sharing the same TM, contributing to it, and retrieving matches from it. The post-editing analysis will show for each segment who re-used matches from the TM and who contributed them to the TM.
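
As a rough illustration of the multi-translator scenario, the sketch below groups per-segment TM scores into match bands for each translator. The band boundaries, the `(translator, score, word_count)` input format, and the function names are assumptions chosen for the example; they are not taken from the Memsource analysis itself.

```python
from collections import Counter, defaultdict

# Hypothetical match bands as (lower bound, label), checked from highest to lowest.
BANDS = [
    (100, "100%"),
    (95, "95-99%"),
    (85, "85-94%"),
    (75, "75-84%"),
    (50, "50-74%"),
    (0, "0-49%"),
]


def band_for(score: float) -> str:
    """Return the label of the first band whose lower bound the score reaches."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise ValueError(f"score must be non-negative, got {score}")


def summarize(segments):
    """Sum word counts per translator and match band.

    segments: iterable of (translator, tm_score, word_count) tuples, where
    tm_score compares the saved TM hit with the final translation.
    """
    report = defaultdict(Counter)
    for translator, score, words in segments:
        report[translator][band_for(score)] += words
    return {name: dict(counts) for name, counts in report.items()}
```

A 100% TM match that a translator edited would arrive here with a score below 100 and would therefore be counted in one of the fuzzy bands.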