Radically efficient machine teaching.
An annotation tool powered by active learning.

From the makers of spaCy
pip install ./prodigy.whl
Successfully installed prodigy

prodigy ner.manual reviews_ner en_core_web_sm ./data.jsonl --label PRODUCT,PERSON,ORG

✨ Starting the web server on port 8080...
Open the app in your browser and start annotating!
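
The command above reads its examples from ./data.jsonl. As a minimal sketch, assuming each line of that file is a JSON object with a "text" key, the input could be prepared like this. The example texts and the write_jsonl helper are hypothetical and not part of Prodigy.

```python
import json
from pathlib import Path

# Hypothetical example texts; replace these with your own raw data.
texts = [
    "The new phone from Acme ships next week.",
    "Jane Doe reviewed the laptop for the magazine.",
]

def write_jsonl(path, texts):
    """Write one JSON object per line, each with a "text" key."""
    with Path(path).open("w", encoding="utf8") as f:
        for text in texts:
            f.write(json.dumps({"text": text}) + "\n")

write_jsonl("data.jsonl", texts)
```

Each line of the resulting file is one self-contained record, so the file can be streamed without loading everything into memory.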

Train a new AI model in hours

Prodigy is a scriptable annotation tool so efficient that data scientists can do the annotation themselves, enabling a new level of rapid iteration.

Today’s transfer learning technologies mean you can train production-quality models with very few examples. With Prodigy you can take full advantage of modern machine learning by adopting a more agile approach to data collection. You'll move faster, be more independent and ship far more successful projects.

How it works

The missing piece in your data science workflow

Prodigy brings together state-of-the-art insights from machine learning and user experience. With its continuous active learning system, you're only asked to annotate examples the model does not already know the answer to. The web application is powerful, extensible and follows modern UX principles. The secret is very simple: it's designed to help you focus on one decision at a time and keep you clicking – like Tinder for data.
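
To illustrate the idea of only asking about examples the model is unsure of, here is a minimal uncertainty-sampling sketch in plain Python. It is not Prodigy's implementation: the select_uncertain function, the band parameter and the scores are assumptions made up for this example.

```python
def select_uncertain(scored_examples, band=0.2):
    """Yield examples whose score lies within `band` of 0.5.

    `scored_examples` is an iterable of (score, example) pairs, where the
    score is a hypothetical model probability between 0 and 1.
    """
    for score, example in scored_examples:
        if abs(score - 0.5) <= band:
            yield example

# Made-up scores standing in for a model's predictions.
scored = [
    (0.98, {"text": "clearly positive"}),
    (0.52, {"text": "hard to call"}),
    (0.03, {"text": "clearly negative"}),
    (0.35, {"text": "leaning negative"}),
]

to_annotate = list(select_uncertain(scored))
```

Here only the two examples scored near 0.5 are kept for annotation, while the confident predictions are skipped.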

Everyone knows data scientists should spend more time looking at their data. When good habits are hard to form, the trick is to remove the friction. Prodigy makes the right thing easy, encouraging you to spend more time understanding your problem and interpreting your results.

Try the demo

Try it live and highlight entities!

This live demo requires JavaScript to be enabled.

Try it live and select text categories!


Try it live and draw bounding boxes!


Try it live and type some text!


Try out new ideas quickly

Annotation is usually the part where projects stall. Instead of having an idea and trying it out, you start scheduling meetings, writing specifications and dealing with quality control. With Prodigy, you can have an idea over breakfast and get your first results by lunch. Once the model is trained, you can export it as a versioned Python package, giving you a smooth path from prototype to production.

What others say

Case Study

Journalism
To facilitate trust, human-in-the-loop workflows are widespread in media applications as stakeholders require the ability to teach and to evaluate models through human-AI interfaces. For their AI projects, the Guardian’s data science team decided to use Prodigy.

Cheyanne Baird

NLP Research Scientist
Prodigy's design aspect was key. [With my previous annotation tools], I would get a lot of feedback from annotators, saying 'it's really hard, because I have to scroll and scroll and scroll to see the labels. There's too many labels. There's too many options.' When I was looking at Prodigy I liked it because you could customize it.

Andy Halterman

Researcher
A lack of labeled data held geoparsing back for years. It took a week to fix that with Prodigy.

Raphael Cohen

Head of Research
Prodigy is by far the best ROI we had on any tool!

Case Study

Finance
Posh focuses on developing custom NLP models trained on real-world banking conversations and custom models for each client’s unique customer base and product offering. To get their NLP models working effectively, the team needed to emphasize human annotation and experimentation, which is why Posh turned to Prodigy.

User Survey Participant

ML Engineer
I really love being able to do almost everything in Python, it means that team members with no front end experience can create tasks super easily.

User Survey Participant

ML Engineer
Prodigy gives you solutions for the problems that you did not even know you have.

Antonio Polo de Alvarado

ML Engineer
I have been working with Prodigy these last few weeks and I can say that it is probably (if not the best) one of the best NLP tools.

Fully scriptable and extensible

Prodigy is fully scriptable, and slots neatly into the rest of your Python-based data science workflow. As the makers of spaCy, a popular library for Natural Language Processing, we understand how to make tools programmers love. The simple secret is this: programmers want to be able to program. Good developer tools need to let you in, not lock you out. That's why Prodigy comes with a rich Python API, elegant command-line integration, and a super productive Jupyter extension. Using custom recipe scripts, you can adapt Prodigy to read and write data however you like, and plug in custom models using any of your favourite frameworks.

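recipe.py

import prodigy
from prodigy.components.stream import get_stream

# Register the function below as a recipe named "custom"
@prodigy.recipe("custom")
def custom_recipe(dataset, source):
    # Load the examples from the source file as a stream
    stream = get_stream(source)
    return {
        "dataset": dataset,          # dataset to save annotations to
        "stream": stream,            # examples to annotate
        "view_id": "classification"  # annotation interface to use
    }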

Command-line usage

prodigy custom my_dataset ./data.jsonl -F recipe.py
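
As a sketch of reading data "however you like", a custom loader can be written as a plain Python generator that yields one task dictionary per example. This assumes a recipe's stream may be any iterable of dictionaries with a "text" key. The csv_stream function, the column name and the sample rows are hypothetical.

```python
import csv
import io

def csv_stream(file_obj, text_column="review"):
    """Yield one task dict per CSV row, skipping rows with empty text."""
    for i, row in enumerate(csv.DictReader(file_obj)):
        text = (row.get(text_column) or "").strip()
        if text:
            # Keep the row index so an annotation can be traced to its source.
            yield {"text": text, "meta": {"row": i}}

# In-memory sample standing in for a real CSV file.
sample = io.StringIO("review,stars\nGreat phone,5\n,3\nBattery died fast,1\n")
tasks = list(csv_stream(sample))
```

Under that assumption, a generator like this could take the place of get_stream(source) in the recipe above.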
