Changelog
This page lists the history of changes to Prodigy. Whenever a new update is available, you'll receive an email notification sent to the address specified at checkout. You can then download the new version via your personal download link.
v1.5.1
This update includes several bug fixes and stability improvements related to the new part-of-speech tagging recipes and the built-in pattern matcher model, as well as a better identification system for match patterns.
| new | Add --resume argument to ner.match to update matcher from dataset. |
| new | Use hashes as pattern IDs to allow updating existing matchers even if pattern files change across sessions. |
| fix | Make pos.teach and pos.batch-train work as expected with both fine-grained and coarse-grained part-of-speech tags. |
| fix | Fix bug in ner.iob-to-gold that'd cause export to fail. |
| fix | Small improvements to UI and web app stability. |
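
To illustrate why hash-based pattern IDs survive changes to a patterns file, here is a minimal sketch. It is not Prodigy's actual hashing scheme: the `pattern_id` helper, the choice of SHA-1 and the 12-character truncation are all assumptions made for illustration.

```python
import hashlib
import json


def pattern_id(pattern):
    """Return a stable ID for one pattern entry (hypothetical helper)."""
    # Serialise with sorted keys so that key order does not affect the hash.
    serialised = json.dumps(pattern, sort_keys=True, ensure_ascii=False)
    return hashlib.sha1(serialised.encode("utf8")).hexdigest()[:12]


patterns = [
    {"label": "FRUIT", "pattern": [{"lower": "apple"}]},
    {"label": "FRUIT", "pattern": [{"lower": "goji"}, {"lower": "berry"}]},
]
ids_before = {pattern_id(p) for p in patterns}

# Simulate editing the file between sessions: add one entry and reorder the rest.
edited = [{"label": "VEGETABLE", "pattern": "carrot"}] + patterns[::-1]
ids_after = {pattern_id(p) for p in edited}

# Every ID from the first session still exists after the edit.
print(ids_before <= ids_after)
```

An ID derived from a pattern's position in the file would shift as soon as a line is inserted above it. An ID derived from the pattern's content does not, so scores stored under it in an earlier session can still be matched up.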
v1.5.0
This update includes new recipes for part-of-speech tagging, an experimental release of the new manual image labeling interface and a new mechanism for adding custom loaders, database connectors and recipes via Python entry points. We've also added validation for incoming streams and detailed error messages for incorrect task formats, enhanced the training options for sparse and gold-standard named entity data, and improved handling of newlines and formatting tokens in the manual NER interface.
| new | New recipes for part-of-speech tagging: pos.teach, pos.batch-train and pos.train-curve. |
| new | Experimental: Manual image annotation interface and image.manual recipe. |
| new | Add annotation task validation. Before the Prodigy server starts, your stream is checked against a schema to make sure it has the correct format. If not, Prodigy tells you what the problem is. |
| new | Allow adding custom recipes, databases and loaders via entry points. |
| new | Add --no-missing flag to ner.batch-train to assume all correct spans are in the gold annotation, and any spans not in the gold annotation are incorrect. This is especially useful when training from annotations collected with ner.manual or ner.make-gold. |
| new | Add --resume argument to terms.teach to update target vector from dataset. |
| new | Add "true" newlines to newline tokens (↵) in manual interfaces. The behaviour can be turned off by setting "hide_true_newline_tokens": true. |
| new | Allow marking tokens as "disabled": true in manual interfaces. Disabled tokens can't be highlighted and can be used to assist annotators with formatting. |
| new | Converter recipe ner.iob-to-gold to convert IOB tags to Prodigy's JSONL. |
| fix | Disable and restore other pipeline components in batch-train recipes. |
| fix | Ensure seed terms are added to the dataset correctly. |
| fix | Fix bug that would cause web app to fail with annotation instructions. |
| fix | Make keyboard shortcuts in choice interface work as expected again. |
| fix | Add missing import and make image.test work out-of-the-box again. |
| doc | Add sections on Python entry points and document new recipes and interfaces. |
| doc | Fix various typos and inconsistencies. |
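
As a rough sketch of what converting IOB tags into span-based JSONL involves, the function below turns parallel lists of tokens and tags into a text with character-offset spans. It is not the implementation of ner.iob-to-gold: the `iob_to_spans` name and the assumption that tokens are joined by single spaces are illustrative only.

```python
import json


def iob_to_spans(tokens, tags):
    """Convert parallel token and IOB-tag lists into text plus character spans."""
    text = ""
    spans = []
    current = None  # the span currently being built, if any
    for token, tag in zip(tokens, tags):
        if text:
            text += " "  # assumption: tokens are separated by single spaces
        start = len(text)
        text += token
        end = len(text)
        prefix, _, label = tag.partition("-")
        if prefix == "I" and current is not None and current["label"] == label:
            current["end"] = end  # extend the open span
            continue
        if current is not None:
            spans.append(current)  # close the previous span
            current = None
        if prefix in ("B", "I"):
            current = {"start": start, "end": end, "label": label}
    if current is not None:
        spans.append(current)
    return {"text": text, "spans": spans}


example = iob_to_spans(
    ["Apple", "hired", "Tim", "Cook", "."],
    ["B-ORG", "O", "B-PERSON", "I-PERSON", "O"],
)
print(json.dumps(example))  # one JSONL line per example
```

The sketch is deliberately lenient: an I- tag that does not continue an open span of the same label simply starts a new span instead of raising an error.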
v1.4.2
This update includes various bug fixes and efficiency improvements.
| new | Allow custom HTML in classification interface. |
| new | Allow pre-defined selections in choice interface, e.g. "accept": [1, 3]. |
| fix | Improve memory usage of terms.teach. |
| fix | Fix data integrity error when dropping datasets using MySQL. |
| fix | Fix bug in error message of custom recipe validation introduced in v1.4.1. |
| fix | Resolve problem with image preloading in image interfaces. |
| fix | Make keyboard shortcuts work as expected in choice interface. |
| doc | Fix various typos and inconsistencies. |
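
For example, a choice task with pre-defined selections could be built as in the sketch below. The "accept" key follows the entry above; the `with_preselection` helper and the example options are made up for illustration and are not part of Prodigy.

```python
import json


def with_preselection(task, selected_ids):
    """Return a copy of a choice task with some options pre-selected."""
    option_ids = {option["id"] for option in task["options"]}
    unknown = [i for i in selected_ids if i not in option_ids]
    if unknown:
        # Guard against pre-selecting IDs that no option uses.
        raise ValueError(f"Unknown option IDs: {unknown}")
    return {**task, "accept": list(selected_ids)}


task = with_preselection(
    {
        "text": "Pick all topics that apply to this article.",
        "options": [
            {"id": 1, "text": "Technology"},
            {"id": 2, "text": "Politics"},
            {"id": 3, "text": "Business"},
        ],
    },
    [1, 3],
)
print(json.dumps(task))  # one line of the JSONL input
```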
v1.4.1
This update improves efficiency of the ner.batch-train recipe and fixes the handling of task and input hashes in the database methods and --exclude option. It also comes with various improvements to error messages and web app stability.
| new | Improve efficiency of ner.batch-train – up to 10× faster for some workloads! |
| fix | Fix problem that would cause text classification tasks created from pattern matches to not have a label assigned to the task. |
| fix | Ensure that --exclude logic is always applied after the stream is (re)hashed. |
| fix | Fix bug that would cause hashes to not be returned correctly by the database. |
| fix | Allow the "instructions" setting to be false or null. |
| fix | Improve error messages if recipe file is not valid and if dataset doesn't exist in terms.to-patterns. |
| fix | Various improvements to UI and web app stability. |
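
The ordering issue behind the --exclude fix can be sketched as follows: if a stream is filtered before its tasks are (re)hashed, stale or missing hashes let already-annotated tasks slip through. This is a simplified illustration, not Prodigy's code. The `add_hashes` and `filter_stream` helpers and the MD5-based hashing scheme are assumptions; only the "_input_hash" and "_task_hash" key names are taken from Prodigy's task format.

```python
import hashlib
import json


def add_hashes(task):
    """Attach an input hash and a task hash to a task (illustrative scheme)."""
    input_part = json.dumps({"text": task.get("text")}, sort_keys=True)
    task["_input_hash"] = hashlib.md5(input_part.encode("utf8")).hexdigest()
    task_part = json.dumps(
        {"input": task["_input_hash"], "label": task.get("label")}, sort_keys=True
    )
    task["_task_hash"] = hashlib.md5(task_part.encode("utf8")).hexdigest()
    return task


def filter_stream(stream, seen_task_hashes, rehash=True):
    """Yield tasks that have not been annotated yet, hashing before excluding."""
    for task in stream:
        if rehash or "_task_hash" not in task:
            task = add_hashes(dict(task))  # hash first ...
        if task["_task_hash"] not in seen_task_hashes:  # ... then exclude
            yield task


annotated = add_hashes({"text": "First text", "label": "SPORTS"})
stream = [
    {"text": "First text", "label": "SPORTS", "_task_hash": "stale"},
    {"text": "Second text", "label": "SPORTS"},
]
remaining = list(filter_stream(stream, {annotated["_task_hash"]}))
print([task["text"] for task in remaining])
```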
v1.4.0
This update includes a new annotation interface for relations and dependencies, as well as an experimental dep.teach recipe. textcat.teach now takes a file of match patterns instead of seed terms, and manual interfaces now support lists of up to 30 labels with keyboard shortcuts. We've also improved the customisation of various components, and added the Prodigy Cookbook.
| new | Dependency and relation annotation interface, plus recipes dep.teach, dep.batch-train and dep.train-curve for training a dependency parsing model. Still experimental! |
| new | Allow using textcat.teach with a patterns file instead of seed terms. |
| new | Support list view and keyboard shortcuts for larger label sets in manual interfaces. |
| new | Add option to display modal with annotation instructions. |
| new | Allow skipping examples with mismatched tokenization in add_tokens. |
| new | Make swipe gestures optional via "swipe": true. |
| new | Allow overwriting the host and port via PRODIGY_HOST and PRODIGY_PORT environment variables. |
| new | Add split_sents_threshold config setting and --unsegmented command-line option to disable sentence segmentation. |
| new | Update NewsAPI loader to use v2. |
| fix | Prevent MySQL server from timing out between requests. |
| fix | Correctly port over spans in split_sentences preprocessor. |
| fix | Always add labels from examples and --labels in ner.batch-train and consistently allow loading label sets from a string or a text file. |
| fix | Fix issue that caused print recipes to not display colours when piped to less. |
| fix | Ensure that pre-set task meta isn't overwritten in the PatternMatcher. |
| fix | Show error message in the web app if view_id is invalid. |
| doc | Add live demo for new dep interface. |
| doc | Add Prodigy Cookbook with quick solutions to various tasks. |
| doc | Add glossary to "First Steps" workflow. |
| doc | Order recipes in PRODIGY_README.html table of contents by type. |
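
The precedence implied by the PRODIGY_HOST and PRODIGY_PORT entry above can be sketched as follows. This is an assumption about how such an override typically works rather than Prodigy's own code: the `resolve_server_address` helper is hypothetical, and the fallback values of "localhost" and 8080 are illustrative defaults.

```python
import os


def resolve_server_address(config, environ=os.environ):
    """Return (host, port), letting environment variables override the config."""
    host = environ.get("PRODIGY_HOST", config.get("host", "localhost"))
    # Environment variables are always strings, so convert the port explicitly.
    port = int(environ.get("PRODIGY_PORT", config.get("port", 8080)))
    return host, port


config = {"host": "localhost", "port": 8080}
overrides = {"PRODIGY_HOST": "0.0.0.0", "PRODIGY_PORT": "8081"}
print(resolve_server_address(config, overrides))
```

Passing the environment in as a parameter keeps the function easy to test without modifying the real process environment.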
v1.3.0
This update introduces a new ner.make-gold recipe that lets you create gold-standard data faster by manually correcting a model's predictions. We've also added a new pos.make-gold recipe for annotating part-of-speech tags, as well as converters to create spaCy training data from Prodigy datasets.
| new | Improved ner.make-gold workflow: run a model over your text and manually correct the entities to create gold-standard data. |
| new | Add "ner_manual_label_style" option to display the label set as a list or a dropdown (always uses a dropdown for more than 10 labels) and add number keyboard shortcuts to the list of labels. |
| new | Experimental pos.make-gold recipe for manual POS annotation. |
| new | Experimental ner.gold-to-spacy and pos.gold-to-spacy converters. |
| new | Add option for custom label color schemes for NER and POS tagging. |
| new | Add UI option to "flag" tasks to bookmark them for later, enabled via the "show_flag" setting and available via a flag icon and the F keyboard shortcut. Add --flagged-only setting to db-out command. |
| new | Rename split_tokens pre-processor to add_tokens. |
| fix | Fix rendering and use icons for whitespace tokens in ner_manual. |
| fix | Fix rendering of RTL languages in manual interfaces via "writing_dir" setting. |
| fix | Overwrite database settings correctly when using connect(). |
| fix | Fix bug in logging timestamp and log minutes correctly. |
| fix | Only use colored CLI output if supported by user's terminal. |
| fix | Don't disable entity recognizer in textcat.batch-train. |
| doc | Document preprocessor components and models' batch_train methods. |
| doc | Fix various typos and add more examples. |
| doc | Add docstrings to internals so they can be inspected using help(). |
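
Conceptually, a converter like ner.gold-to-spacy maps annotated examples onto spaCy's simple training format of (text, {"entities": [(start, end, label)]}) tuples. The sketch below shows that mapping under stated assumptions: the `to_spacy_format` helper is hypothetical, it only keeps examples whose "answer" is "accept", and it does not reproduce the actual recipe's options or output handling.

```python
def to_spacy_format(examples):
    """Convert accepted span annotations into (text, annotations) tuples."""
    training_data = []
    for eg in examples:
        if eg.get("answer") != "accept":
            continue  # assumption: only accepted examples count as gold
        entities = [
            (span["start"], span["end"], span["label"])
            for span in eg.get("spans", [])
        ]
        training_data.append((eg["text"], {"entities": entities}))
    return training_data


examples = [
    {
        "text": "Berlin is nice",
        "spans": [{"start": 0, "end": 6, "label": "GPE"}],
        "answer": "accept",
    },
    {"text": "Skipped example", "spans": [], "answer": "ignore"},
]
print(to_spacy_format(examples))
```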
v1.2.0
This update introduces ner.manual, a new recipe and interface for manual NER annotation. You can now highlight one or more text spans per task and select the entity label from a dropdown menu. To allow faster annotation and less fiddly clicking, token boundaries are used to determine the entity spans when highlighting them. Note that this workflow replaces ner.mark and boundaries.
| new | ner.manual recipe and interface for manual NER annotation. |
| new | "card_css" option to inject custom CSS into annotation card. |
| new | Experimental "show_whitespace" for basic ner interface. |
| fix | Make --exclude argument and recipe option work as expected. |
| fix | Don't merge and modify NER spans before adding example to the database. |
| doc | Document API of PatternMatcher model. |
| doc | Improve formatting of available recipes in prodigy --help. |
| doc | Fix various typos and inconsistencies. |
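
The idea of using token boundaries to determine entity spans can be sketched like this: any selection that touches part of a token is expanded to cover that token completely. The `snap_to_tokens` helper is hypothetical and only approximates what the interface does in the browser; the token dictionaries with "start", "end" and "id" keys are modelled on the pre-tokenized task format.

```python
def snap_to_tokens(tokens, sel_start, sel_end):
    """Expand a character selection to the boundaries of the tokens it touches."""
    touched = [t for t in tokens if t["start"] < sel_end and t["end"] > sel_start]
    if not touched:
        return None  # the selection covered no token, e.g. only whitespace
    return {
        "start": touched[0]["start"],
        "end": touched[-1]["end"],
        "token_start": touched[0]["id"],
        "token_end": touched[-1]["id"],
    }


text = "Barack Obama visited Berlin"
tokens = [
    {"text": "Barack", "start": 0, "end": 6, "id": 0},
    {"text": "Obama", "start": 7, "end": 12, "id": 1},
    {"text": "visited", "start": 13, "end": 20, "id": 2},
    {"text": "Berlin", "start": 21, "end": 27, "id": 3},
]

# A sloppy selection of "ack Oba" snaps to the two tokens it overlaps.
span = snap_to_tokens(tokens, 3, 10)
print(text[span["start"]:span["end"]])
```

Because the result always starts and ends on a token boundary, annotators do not need to hit exact characters, and the resulting spans line up with the model's tokenization.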
v1.1.0
| new | Automatically add new entity labels in ner.batch-train. |
| new | Improve speed during NER training and allow setting the beam width via CLI. |
| new | Filter out ignored examples before creating training and evaluation sets. |
| new | Re-add improved version of ner.eval recipe. |
| new | Handle broken JSONL in Reddit loader. |
| new | Use spaCy model to assign labels in ner.print-stream. |
| doc | Small improvements to documentation. |
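
Filtering out ignored examples before splitting matters because otherwise they would count towards the size of the evaluation set without contributing any signal. A minimal sketch of that order of operations is shown below. The `split_examples` helper, its parameters and the fixed seed are assumptions for illustration; only the "answer" key with the value "ignore" is taken from the annotation format.

```python
import random


def split_examples(examples, eval_split=0.2, seed=0):
    """Drop ignored examples, then split the rest into train and eval sets."""
    kept = [eg for eg in examples if eg.get("answer") != "ignore"]
    random.Random(seed).shuffle(kept)  # seeded so the split is reproducible
    n_eval = int(len(kept) * eval_split)
    return kept[n_eval:], kept[:n_eval]


examples = [
    {"text": f"example {i}", "answer": "ignore" if i % 5 == 0 else "accept"}
    for i in range(10)
]
train, evaluation = split_examples(examples, eval_split=0.25)
print(len(train), len(evaluation))
```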