Named Entity Recognition

Tagging names, concepts or key phrases is a crucial task for Natural Language Understanding pipelines. The Prodigy annotation tool lets you label NER training data or improve an existing model's accuracy with ease.

Fast and flexible annotation

Prodigy’s web-based annotation app has been carefully designed to be as efficient as possible. By breaking complex tasks down into smaller units of work, your annotators stay focused on one decision at a time, giving you better data, faster.

The manual interface lets you label entities by highlighting spans of text by hand. Your annotations snap to token boundaries, and you can mark single-word entities by double-clicking. To improve annotation speed and accuracy, you can use spaCy's powerful match patterns, a pretrained NER model or any other custom Python logic to pre-select spans.
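To make "snap to token boundaries" concrete, here is a minimal sketch in plain Python. It assumes a crude regex tokenizer and a hypothetical `snap_to_tokens` helper; neither is Prodigy's or spaCy's actual implementation, which is not shown on this page.

```python
import re

def tokenize(text):
    """Return (start, end) character offsets for each token.

    Assumption: a deliberately crude tokenizer (runs of word characters,
    or single punctuation marks). Real tokenizers are more careful.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]

def snap_to_tokens(text, sel_start, sel_end):
    """Expand a raw character selection to cover every token it touches."""
    touched = [
        (start, end)
        for start, end in tokenize(text)
        if start < sel_end and end > sel_start  # token overlaps the selection
    ]
    if not touched:
        return None  # the selection covered only whitespace
    return touched[0][0], touched[-1][1]

text = "Uber opened an office in New York."
# A sloppy drag from the "w" of "New" to the "o" of "York"
span = snap_to_tokens(text, 27, 31)
print(span, repr(text[span[0]:span[1]]))
```

Because the selection is widened to whole tokens, an imprecise drag still yields a clean span, which is the behaviour the paragraph above describes.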

Once your model is reasonably accurate, the binary interface lets you fly through large numbers of inputs by simply saying “yes” or “no” to specific suggestions. It’s a great way to confirm likely predictions, reject unlikely ones, or focus only on possible mistakes.

patterns.jsonl

{"pattern": [{"lower": "new"}, {"lower": "york"}], "label": "CITY"}
{"pattern": [{"lower": "berlin"}], "label": "CITY"}

Bootstrap with powerful patterns

Prodigy is a fully scriptable annotation tool, letting you automate as much as possible with custom rule-based logic. You don’t want to waste time labeling every instance of common entities like "New York" or "the United States" by hand. Instead, give Prodigy rules or a list of examples, review the entities in context and annotate the exceptions. As you annotate, a statistical model can learn to suggest similar entities, generalising beyond your initial patterns.
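As a rough illustration of how token patterns can pre-select entities for review, the sketch below applies two `patterns.jsonl`-style entries to a sentence. It is a simplified stand-in written in plain Python: it assumes patterns use only the `lower` attribute, and the tokenizer and the `suggest_spans` helper are hypothetical. spaCy's real matcher supports far more than this.

```python
import json
import re

# Two example entries in the patterns.jsonl format, one JSON object per line.
PATTERNS_JSONL = """\
{"pattern": [{"lower": "new"}, {"lower": "york"}], "label": "CITY"}
{"pattern": [{"lower": "berlin"}], "label": "CITY"}
"""

def load_patterns(jsonl):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl.splitlines() if line.strip()]

def tokenize(text):
    """Assumption: crude regex tokenizer returning (token, start, end)."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]

def suggest_spans(text, patterns):
    """Return character spans where a pattern's lowercase tokens match."""
    tokens = tokenize(text)
    spans = []
    for entry in patterns:
        wanted = [tok["lower"] for tok in entry["pattern"]]
        n = len(wanted)
        # Slide a window of n tokens over the text and compare lowercase forms.
        for i in range(len(tokens) - n + 1):
            window = tokens[i:i + n]
            if [tok[0].lower() for tok in window] == wanted:
                spans.append({"start": window[0][1],
                              "end": window[-1][2],
                              "label": entry["label"]})
    return sorted(spans, key=lambda s: s["start"])

text = "She moved from Berlin to New York last year."
for span in suggest_spans(text, load_patterns(PATTERNS_JSONL)):
    print(span["label"], text[span["start"]:span["end"]])
```

Each suggested span would then be shown in context, so the annotator only has to confirm the matches and correct the exceptions.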

Focus on what the model is most uncertain about

Prodigy puts the model in the loop, so that it can actively participate in the training process, using what it already knows to figure out what to ask you next. The model learns as you go, based on the answers you provide. Most annotation tools make no suggestions to the user, so as not to bias the annotations. Prodigy takes the opposite approach: ask the user as little as possible.
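One common way to "ask about what the model is most uncertain about" is uncertainty sampling: prefer candidates whose predicted probability is closest to 0.5. The sketch below shows that generic idea only. The scores are made up for illustration, the helper names are hypothetical, and this page does not say exactly how Prodigy ranks its questions.

```python
def uncertainty(score):
    """Map a probability to an uncertainty value.

    Returns roughly 0.0 when the model is sure (score near 0 or 1)
    and 1.0 when it is undecided (score of 0.5).
    """
    return 1.0 - abs(score - 0.5) * 2

def most_uncertain(scored_examples, n):
    """Return the n (score, text) pairs the model is least sure about."""
    ranked = sorted(scored_examples,
                    key=lambda pair: uncertainty(pair[0]),
                    reverse=True)
    return ranked[:n]

# Hypothetical model scores for "is this span an entity?"
candidates = [
    (0.97, "Berlin"),
    (0.52, "Apple"),
    (0.08, "the"),
    (0.41, "Jordan"),
    (0.88, "New York"),
]

for score, text in most_uncertain(candidates, 2):
    print(f"{text}: score={score:.2f}")
```

Confident predictions such as "Berlin" or "the" are skipped, so the annotator's yes/no decisions go to the borderline cases where an answer is likely to teach the model the most.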

View the documentation