This page lists the history of changes to Prodigy. Whenever a new update
is available, you'll receive an email notification sent to the address specified at checkout. You can then download the new version via your personal download link.
This update takes advantage of pre-built binary wheels for our
dependencies and speeds up the installation by up to 10 times! We've also
added official support for Python 3.7, made excluding the current dataset
the default behaviour, fixed issues related to patterns, text
classification and NER training, and improved some internals to get
Prodigy ready for multi-user workflows and the upcoming Prodigy Scale.
Add official support and wheels for Python 3.7.
Use spaCy v2.0.16 to take advantage of pre-built wheels and allow up to 10 times faster installation.
Automatically exclude examples already present in the current dataset (i.e. make --exclude dataset the default behaviour). To disable this feature, you can set "auto_exclude_current": false in your prodigy.json or recipe config.
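As a rough illustration of the setting above, the following sketch builds the config fragment in Python and serialises it to JSON. Only the "auto_exclude_current" key comes from this changelog. The variable names are made up for the example, and how you merge the fragment into an existing prodigy.json is up to you.

```python
import json

# Only "auto_exclude_current" is taken from the changelog; the rest is illustrative.
settings = {"auto_exclude_current": False}

# Serialise to the JSON you would merge into prodigy.json or a recipe config.
config_json = json.dumps(settings, indent=2)
print(config_json)
```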
This update includes several bug fixes and stability improvements
related to the new part-of-speech tagging recipes and the built-in
pattern matcher model, as well as a better identification system for
Add --resume argument to ner.match to update matcher from dataset.
Use hashes as pattern IDs to allow updating existing matchers even if pattern files change across sessions.
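To see why content-based IDs survive edits to a pattern file, here is a minimal sketch. It is not Prodigy's actual hashing scheme: the pattern_id helper, the choice of SHA-1 and the example patterns are all assumptions made for illustration.

```python
import hashlib
import json


def pattern_id(pattern: dict) -> str:
    """Return a stable ID derived from the pattern's content (illustrative only)."""
    # Sort keys so logically identical patterns serialise identically.
    serialised = json.dumps(pattern, sort_keys=True, ensure_ascii=False)
    return hashlib.sha1(serialised.encode("utf-8")).hexdigest()[:12]


old_file = [
    {"label": "FRUIT", "pattern": [{"lower": "apple"}]},
    {"label": "FRUIT", "pattern": [{"lower": "pear"}]},
]
# A later session inserts a new pattern at the top, shifting every line position.
new_file = [{"label": "FRUIT", "pattern": [{"lower": "kiwi"}]}] + old_file

old_ids = {pattern_id(p) for p in old_file}
new_ids = {pattern_id(p) for p in new_file}
```

Because the ID depends only on the pattern's content, the two original patterns keep their IDs even though their positions in the file have changed. IDs based on line numbers would not.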
This update includes new recipes for part-of-speech tagging, an
experimental release of the new manual image labeling interface and
a new mechanism for adding custom loaders, database connectors
and recipes via Python entry points. We've also added validation for
incoming streams and detailed error messages for incorrect task formats,
enhanced the training options for sparse and gold-standard named entity
data, and improved handling of newlines and formatting tokens in the
manual NER interface.
Add annotation task validation. Before the Prodigy server starts, your stream is checked against a schema to make sure it has the correct format. If not, Prodigy tells you what the problem is.
Allow adding custom recipes, databases and loaders via entry points.
Add --no-missing flag to ner.batch-train to assume all correct spans are in the gold annotation, and any spans not in the gold annotation are incorrect. This is especially useful when training from annotations collected with ner.manual or ner.make-gold.
Add --resume argument to terms.teach to update target vector from dataset.
Add "true" newlines to newline tokens ↵ in manual interfaces. The behaviour can be turned off by setting "hide_true_newline_tokens": true.
Allow marking tokens as "disabled": true in manual interfaces. Disabled tokens can't be highlighted and can be used to assist annotators with formatting.
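The disabled-token option above could be used roughly as follows. This is a sketch under assumptions: the make_task helper and the whitespace-based tokenisation are hypothetical, and the token keys other than "disabled" ("text", "start", "end", "id") are assumed for the example rather than taken from this changelog.

```python
import re


def make_task(text: str) -> dict:
    """Build a manual-annotation task whose newline tokens are disabled."""
    tokens = []
    # Match single newlines or runs of non-whitespace; other whitespace is skipped.
    for i, match in enumerate(re.finditer(r"\n|\S+", text)):
        token = {
            "text": match.group(),
            "start": match.start(),
            "end": match.end(),
            "id": i,
        }
        if match.group() == "\n":
            # Disabled tokens are shown but cannot be highlighted.
            token["disabled"] = True
        tokens.append(token)
    return {"text": text, "tokens": tokens}


task = make_task("Hello world\nSecond line")
disabled = [t["id"] for t in task["tokens"] if t.get("disabled")]
```

Here only the newline token is marked as disabled, so annotators can still see the line break but cannot include it in a span.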
This update improves efficiency of the ner.batch-train recipe and fixes the handling of task and input hashes in the database methods and --exclude option. It also comes with various improvements to error messages and web app stability.
Improve efficiency of ner.batch-train – up to 10× faster for some workloads!
Fix problem that would cause text classification tasks created from pattern matches to not have a label assigned to the task.
Ensure that --exclude logic is always applied after the stream is (re)hashed.
Fix bug that would cause hashes to not be returned correctly by the database.
Allow the "instructions" setting to be false or null.
Improve error messages if recipe file is not valid and if dataset doesn't exist in terms.to-patterns.
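The ordering fix for --exclude can be pictured with a small sketch. Everything here is an assumption made for illustration: the rehash and exclude_seen helpers, the use of MD5 and the "_input_hash" key are stand-ins, not Prodigy's internals.

```python
import hashlib


def rehash(stream):
    """Attach a fresh hash of the input text to each task (illustrative scheme)."""
    for task in stream:
        digest = hashlib.md5(task["text"].encode("utf-8")).hexdigest()
        yield {**task, "_input_hash": digest}


def exclude_seen(stream, seen_hashes):
    """Drop tasks whose hash already exists in the excluded datasets."""
    for task in stream:
        if task["_input_hash"] not in seen_hashes:
            yield task


seen = {hashlib.md5(b"already annotated").hexdigest()}
# The first task carries a stale hash, so filtering before rehashing would miss it.
incoming = [
    {"text": "already annotated", "_input_hash": "stale"},
    {"text": "brand new"},
]
remaining = list(exclude_seen(rehash(incoming), seen))
```

Filtering only after the stream has been (re)hashed means the comparison always uses up-to-date hashes, so the previously annotated example is dropped as expected.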
This update includes a new annotation interface for relations and dependencies, as well as an experimental dep.teach recipe. textcat.teach now takes a file of match patterns instead of seed terms, and manual interfaces
now support lists of up to 30 labels with keyboard shortcuts. We've also
improved the customisation of various components, and added the Prodigy Cookbook.
This update introduces a new ner.make-gold recipe that lets you create gold-standard data faster by manually correcting a model's predictions. We've also added a new pos.make-gold recipe for annotating part-of-speech tags, as well as converters to create spaCy training data from Prodigy datasets.
Improved ner.make-gold workflow: run a model over your text and manually correct the entities to create gold-standard data.
Add "ner_manual_label_style" option to display the label set as a list or dropdown (always uses the dropdown for more than 10 labels) and add number keyboard shortcuts to the list of labels.
This update introduces ner.manual, a new recipe and interface for manual NER annotation. You can now highlight one or more text spans per task and select the
entity label from a dropdown menu. To allow faster annotation and less
fiddly clicking, token boundaries are used to determine the entity spans when highlighting them. Note that this workflow replaces ner.mark and boundaries.
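One way to picture boundary snapping is the sketch below. The changelog only says that token boundaries determine the span; the snap_to_tokens helper, the overlap rule and the space-based tokenisation are assumptions made for the example.

```python
def snap_to_tokens(tokens, sel_start, sel_end):
    """Expand a character selection to cover every token it overlaps."""
    touched = [t for t in tokens if t["start"] < sel_end and t["end"] > sel_start]
    if not touched:
        return None
    return {"start": touched[0]["start"], "end": touched[-1]["end"]}


text = "Apple opens office in San Francisco"
tokens = []
offset = 0
for word in text.split(" "):
    tokens.append({"text": word, "start": offset, "end": offset + len(word)})
    offset += len(word) + 1  # skip the single space after each word

# A sloppy selection from the middle of "San" to the middle of "Francisco".
span = snap_to_tokens(tokens, 23, 30)
snapped_text = text[span["start"]:span["end"]]
print(snapped_text)
```

With this rule, an imprecise selection still yields a span that starts and ends on whole tokens, which is what makes highlighting less fiddly.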
ner.manual recipe and interface for manual NER annotation.
"card_css" option to inject custom CSS into annotation card.
Experimental "show_whitespace" for basic ner interface.
Make --exclude argument and recipe option work as expected.
Don't merge and modify NER spans before adding example to the database.
Document API of PatternMatcher model.
Improve formatting of available recipes in prodigy --help.