Evaluate your generative models
Many of the most exciting capabilities of today's neural network models produce outputs that aren't simply "right" or "wrong". Prodigy makes it easy to conduct rigorous manual evaluations of these models in just a few minutes, using randomised A/B testing. Which model produced which output is hidden from you, so your judgment of which output is better can't be biased. As soon as you exit the server, Prodigy shows you the result, and a full log of your decisions is kept so that anyone can reproduce the evaluation.
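The idea behind a blinded A/B evaluation is simple: randomise which model's output appears in which slot, record the annotator's choices, and only un-blind when tallying the results. The sketch below illustrates that logic in plain Python. It is not Prodigy's API (Prodigy handles the shuffling, the annotation interface and the result tallying for you); the function names, keys and example sentences are assumptions made for illustration.

```python
import json
import random
from collections import Counter


def build_ab_tasks(inputs, outputs_model_1, outputs_model_2, seed=0):
    """Pair the two models' outputs, randomising which appears as A or B."""
    rng = random.Random(seed)
    tasks = []
    for text, out1, out2 in zip(inputs, outputs_model_1, outputs_model_2):
        mapping = {"A": "model_1", "B": "model_2"}
        if rng.random() < 0.5:  # hide which model is which
            mapping = {"A": "model_2", "B": "model_1"}
        outputs = {"model_1": out1, "model_2": out2}
        tasks.append({
            "input": text,
            "A": outputs[mapping["A"]],
            "B": outputs[mapping["B"]],
            # The mapping is kept (but never shown) so decisions can be
            # un-blinded and reproduced later.
            "mapping": mapping,
        })
    return tasks


def tally(tasks, decisions):
    """Translate blinded A/B choices back into per-model win counts."""
    counts = Counter()
    for task, choice in zip(tasks, decisions):  # choice is "A" or "B"
        counts[task["mapping"][choice]] += 1
    return counts


if __name__ == "__main__":
    inputs = ["She go to school yesterday."]
    tasks = build_ab_tasks(inputs,
                           ["She went to school yesterday."],
                           ["She goes to school yesterday."])
    print(json.dumps(tasks[0], indent=2))
    print(tally(tasks, ["A"]))
```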
Spelling and grammar correction
Spelling and grammar correction is another example of a task where the system has a range of acceptable outputs, making it hard to evaluate automatically against a single "gold standard". The compact diff view is useful for this task, because the two systems' outputs are usually very similar to the input.
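As a rough illustration of why a diff helps here (this is not how Prodigy renders its diff view, and the example sentences are made up), Python's standard difflib can reduce two near-identical sentences to just the words that changed:

```python
import difflib


def word_diff(original, corrected):
    """Return a compact list of (op, before, after) word-level changes."""
    a, b = original.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=a, b=b)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # keep only the parts that actually differ
            changes.append((op, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


sentence = "She go to school yesterday ."
print(word_diff(sentence, "She went to school yesterday ."))
# [('replace', 'go', 'went')]
print(word_diff(sentence, "She goes to school yesterday ."))
# [('replace', 'go', 'goes')]
```

Because each system typically changes only a word or two, comparing the highlighted changes is much faster than re-reading both full outputs against the input.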