Custom Interfaces

Prodigy ships with a range of built-in annotation interfaces for annotating text, images and other content. You can also put together fully custom solutions by combining interfaces and adding custom HTML, CSS and JavaScript.

Combining interfaces with blocks (New: 1.9)

Blocks let you freely combine different annotation interfaces. For example, you can add a text_input field to the ner_manual interface to collect additional free-form comments from the annotators. Or you can add custom html blocks before and after the image_manual UI or a multiple choice interface.

Blocks are defined as a list of dictionaries in the "config" returned by your recipe and will be rendered in order. They all use the data present in the current annotation task, unless you override certain properties in the block. Each block needs to define at least a "view_id", mapping to one of the existing annotation interfaces.

recipe.py (excerpt)
return {
    "dataset": dataset,
    "stream": stream,
    "view_id": "blocks",
    "config": {
        "blocks": [
            {"view_id": "ner_manual"},
            {"view_id": "choice", "text": None},
            {"view_id": "text_input", "field_rows": 3},
        ]
    }
}

In the above example, the "text" property of the second block is overwritten and set to None, so it won’t render any text. That’s because both ner_manual and choice render the "text" of an annotation task if present, so it would show up twice, once in each block. You typically also want to define text_input settings with the block, so you can have multiple inputs if needed and don’t need to store the presentational settings with each annotation task.
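
One way to picture the override is as a per-block merge of the task with the block's own properties. The sketch below is only a mental model under that assumption, not Prodigy's actual implementation; `block_view` and the sample task are hypothetical.

```python
def block_view(task: dict, block: dict) -> dict:
    """Return the task as a given block would see it (illustrative only)."""
    # Everything in the block except "view_id" is treated as an override.
    overrides = {key: value for key, value in block.items() if key != "view_id"}
    return {**task, **overrides}


task = {
    "text": "Apple updates its analytics service",
    "options": [{"id": "A", "text": "Relevant"}],
}
blocks = [
    {"view_id": "ner_manual"},
    {"view_id": "choice", "text": None},
    {"view_id": "text_input", "field_rows": 3},
]

views = {block["view_id"]: block_view(task, block) for block in blocks}
print(views["ner_manual"]["text"])  # the original text
print(views["choice"]["text"])      # None, so the text is not rendered twice
```

The task itself is left untouched, which matches the idea that overrides are presentational and should not change what is saved.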

Aside from the task content, you can also define the following config settings with each block:

| Setting | Description |
| --- | --- |
| html_template | HTML template with variables to render the task content. Setting it in the block allows you to have multiple custom HTML blocks rendering different content. |
| labels | Label set used in the ner_manual and image_manual interfaces. Setting it in the block is a bit cleaner and technically allows you to create blocks using both interfaces together with different labels (although not necessarily recommended). |

The annotation task, i.e. what your stream sends out, should contain the content you’re annotating and what you want to save in the database. The config and overrides in the blocks are applied to all tasks, so they should only really include presentational settings, like the HTML template to use, the ID or placeholder of input fields or overrides to prevent the same content from being displayed by several blocks.

Multiple blocks of the same type are only supported for the text_input and html interface. For text input blocks, you can use the "field_id" to assign unique names to your fields and make sure they all write the user input to different keys. For HTML blocks, you can use different "html_template" values that reference different keys of your task.

Example
[
  {"view_id": "text_input", "field_id": "user_input_a", "field_rows": 1},
  {"view_id": "text_input", "field_id": "user_input_b", "field_rows": 3},
  {"view_id": "html", "html_template": "<strong>{{content_a}}</strong>"},
  {"view_id": "html", "html_template": "<em>{{content_b}}</em>"}
]
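
For the blocks above to have something to render, each task needs to carry the keys that the templates reference. The following is a minimal sketch of a stream producing such tasks; `make_tasks` and the sample rows are hypothetical, and the note on input keys assumes the text_input blocks write to their configured "field_id" as described above.

```python
def make_tasks(rows):
    """Yield annotation tasks carrying the keys used by the two HTML templates."""
    for title, body in rows:
        yield {"content_a": title, "content_b": body}


rows = [("First title", "First body"), ("Second title", "Second body")]
tasks = list(make_tasks(rows))

# Keys referenced by the "html_template" values in the blocks above
template_keys = {"content_a", "content_b"}
# Keys the two text_input blocks are expected to write the user input to
input_keys = {"user_input_a", "user_input_b"}

# The two sets should not overlap, otherwise user input could overwrite task content
print(template_keys & input_keys)
```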

Custom interfaces with HTML, CSS and JavaScript

The html interface lets you render any plain HTML content, specified as the "html" key in the annotation task.

JSON task format
{"html": "<img src='image.jpg'>"}

If you don’t want to include the full markup in every task, you can also specify an html_template in your global or the project’s prodigy.json config file. Prodigy uses Mustache to render the templates, and makes all task properties available as variables wrapped in double curly braces, e.g. {{text}}. Using HTML templates, you can visualize complex, nested data without having to convert your input to match Prodigy’s format. When you annotate the task, Prodigy will simply add an "answer" key.

For example, let’s say your annotation tasks look like this:

JSON task format
{
  "title": "This is a title",
  "image": "https://images.unsplash.com/photo-1472148439583-1f4cf81b80e0?ixlib=rb-0.3.5&q=80&fm=jpg&crop=entropy&cs=tinysrgb&w=400&fit=max&s=21dba460262331eb682c38680f7f1135",
  "data": {"value": "This is a value"},
  "meta": {
    "by": "Peter Nguyen",
    "url": "https://unsplash.com/@peterng1618"
  },
  "html": " "
}

You could then write the following HTML template that has access to the task properties as template variables, including the nested data.value:

HTML template
<h2>{{title}}</h2>
<strong>{{data.value}}</strong>
<br />
<img src="{{image}}" />
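
To make the variable lookup concrete, here is a small Python sketch of how a dotted name like data.value can resolve against a nested task. It is a simplified stand-in written for illustration, not the Mustache implementation Prodigy uses: it only handles plain variables and does not HTML-escape values. The `render` helper is hypothetical.

```python
import re


def render(template: str, task: dict) -> str:
    """Replace {{dotted.path}} placeholders with values looked up in the task."""

    def lookup(match: re.Match) -> str:
        value = task
        for part in match.group(1).split("."):
            value = value[part]  # walk into nested dictionaries
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


task = {"title": "This is a title", "data": {"value": "This is a value"}}
template = "<h2>{{title}}</h2>\n<strong>{{data.value}}</strong>"
print(render(template, task))
```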

All theme settings will also be available to the custom interface via the {{theme}} variable. This lets you re-use Prodigy’s color and styles, without having to hard-code them into your template. For example:

<mark style="background: {{theme.bgHighlight}}">{{text}}</mark>

JavaScript and CSS (New: 1.7)

Using custom CSS

The "global_css" config setting lets you pass in a string of CSS overrides that will be added to the global scope of the app. As of v1.14.5 you can also mount a directory with custom CSS code using "global_css_dir" or pass a list of remote CSS files to load using the "global_css_urls" config setting. This lets you customize styles beyond just the color theme. As of v1.7.0, the Prodigy app exposes the following human-readable class names:

| Class name | Description |
| --- | --- |
| .prodigy-root | Root element the app is rendered into. |
| .prodigy-buttons | Row of action buttons (accept, reject, ignore, undo). |
| .prodigy-button-accept | (New: 1.9) The accept button. |
| .prodigy-button-reject | (New: 1.9) The reject button. |
| .prodigy-button-ignore | (New: 1.9) The ignore button. |
| .prodigy-button-undo | (New: 1.9) The undo button. |
| .prodigy-annotator | (New: 1.9) Main container of the annotation UI (excluding the sidebar). |
| .prodigy-sidebar-wrapper | (New: 1.9) Container of the sidebar. |
| .prodigy-sidebar | (New: 1.9) The sidebar. |
| .prodigy-sidebar-title | (New: 1.9) The sidebar title bar. |
| .prodigy-container | Container of the annotation card (including title, content, meta). |
| .prodigy-content | Main content of the annotation card (containing the text or image). |
| .prodigy-title | Title bar containing the label(s). |
| .prodigy-title-wrapper | Sticky wrapper around the title bar containing the labels. |
| .prodigy-meta | Meta information in the bottom right corner of the annotation card. |
| .prodigy-text-input | Text input fields (<input> or <textarea>) created by the text_input interface. |
| .prodigy-spans | (New: 1.10.8) Spans container in interfaces like ner and ner_manual. |
| .prodigy-labels | (New: 1.11) Container of selectable labels, e.g. in ner_manual UI. |
| .prodigy-label | (New: 1.11) Individual selectable label, e.g. in ner_manual UI. |
| .prodigy-options | (New: 1.11) Container for choice options in choice UI. |
| .prodigy-option | (New: 1.11) Individual choice option in choice UI. |

In addition to the class names, some elements also expose specific data attributes. This lets you target them based on their value, e.g. .prodigy-root[data-prodigy-view-id="ner"].

| Data attribute | Element | Description |
| --- | --- | --- |
| data-prodigy-view-id | .prodigy-root | The name of the current interface, e.g. "ner_manual". |
| data-prodigy-recipe | .prodigy-root | The name of the current recipe, e.g. ner.manual or terms.teach. |
| data-prodigy-label | .prodigy-label | (New: 1.11) The name of a selectable label, e.g. "PERSON". |
| data-prodigy-option | .prodigy-option | (New: 1.11) The string value of a choice option, e.g. "1". |

Data attributes also allow you to implement custom styling specific to individual recipes or interfaces in the same global stylesheet. For example:

[data-prodigy-view-id='ner_manual'] .prodigy-title {
  /* Change background for the title/labels bar only in the manual NER interface */
  background: green;
}

[data-prodigy-recipe='custom-recipe'] .prodigy-buttons {
  /* Hide the action buttons only in a custom recipe "custom-recipe"  */
  display: none;
}
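
On the recipe side, the stylesheet is passed as a plain string in the "global_css" setting. The sketch below shows one way to assemble that string in Python; `build_config` and `GLOBAL_CSS` are hypothetical names, and only the "global_css" key itself comes from the description above.

```python
GLOBAL_CSS = """
[data-prodigy-view-id='ner_manual'] .prodigy-title {
  background: green;
}
"""


def build_config(extra_css: str = "") -> dict:
    """Return a recipe "config" dict with the CSS overrides attached."""
    return {"global_css": GLOBAL_CSS + extra_css}


# Append a rule that hides the meta information on the annotation card
config = build_config(".prodigy-meta { display: none; }")
print(config["global_css"])
```

A recipe could then return this dictionary under its "config" key, next to "dataset", "stream" and "view_id".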

Using custom JavaScript

As of v1.7.0, the "javascript" config setting is available across all interfaces and lets you pass in a string of JavaScript code that will be executed in the global scope. As of v1.14.5 you can also mount a directory with custom JavaScript code using "javascript_dir" or pass a list of remote JavaScript files to load using the "javascript_urls" config setting. This lets you implement fully custom interfaces that modify and interact with the current incoming task. The following internals are exposed via the global window.prodigy object:

| Property | Type | Description |
| --- | --- | --- |
| viewId | string | The ID of the current interface, e.g. 'ner_manual'. |
| config | object | The user configuration settings (e.g. prodigy.json plus recipe config). |
| content | object | The content of the current task (e.g. {'text': 'Some text'}). |
| theme | object | Theme variables like colors. |
| update | function | Update the current task. Takes an object with the updates and performs a shallow (!) merge. |
| answer | function | Answer the current task. Takes the string value of the answer, e.g. 'accept'. |
| event | function | (New: 1.14.1) Call into a custom event_hook defined in a custom Prodigy recipe. |
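
The shallow merge performed by update is easy to trip over when a task has nested values. The Python sketch below mimics the semantics with a dictionary merge to show the pitfall; it is an illustration of what "shallow" means, not the code the app runs, and `shallow_update` is a hypothetical helper.

```python
def shallow_update(content: dict, updates: dict) -> dict:
    """Merge top-level keys only, mirroring a shallow merge."""
    return {**content, **updates}


content = {"text": "Some text", "meta": {"source": "news", "score": 0.5}}

# Passing a partial nested object replaces the whole "meta" value
lossy = shallow_update(content, {"meta": {"score": 0.9}})
print(lossy["meta"])  # {'score': 0.9}

# To keep the other nested keys, merge the nested object yourself first
safe = shallow_update(content, {"meta": {**content["meta"], "score": 0.9}})
print(safe["meta"])  # {'source': 'news', 'score': 0.9}
```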

Here’s a simple example of a custom HTML template and JavaScript function that allows toggling a button to change the task text from uppercase to lowercase, and vice versa. For more details and inspiration, check out this thread on the forum.

HTML Template
<button class="custom-button" onClick="updateText()">
  👇 Text to uppercase
</button>
<br />
<strong>{{text}}</strong>
JavaScript
let upper = false

function updateText() {
  const text = window.prodigy.content.text
  const newText = !upper ? text.toUpperCase() : text.toLowerCase()
  window.prodigy.update({ text: newText })
  upper = !upper
  document.querySelector('.custom-button').textContent =
    '👇 Text to ' + (upper ? 'lowercase' : 'uppercase')
}

You can also listen to the following custom events fired by the app:

| Event | Description |
| --- | --- |
| prodigymount | The app has mounted. |
| prodigyupdate | The UI was updated. |
| prodigyanswer | The user answered a question. |
| prodigyundo | (New: 1.8.4) The user hit undo and re-added a task to the queue. |
| prodigyspanselected | A span was selected in the manual NER interface. |
| prodigyend | (New: 1.10.8) No more tasks are available. |
| prodigyscriptloaded | (New: 1.14.5) A custom JavaScript file finished loading. |

Some events expose additional data via the detail property. For example, the prodigyanswer event exposes the task that was answered, as well as the answer that was chosen: 'accept', 'reject' or 'ignore'.

JavaScript
document.addEventListener('prodigyanswer', event => {
  const { answer, task } = event.detail
  console.log('The answer was: ', answer)
})

Adding remote JavaScript and CSS files (New: 1.14.5)

You can also add remote JavaScript and CSS files to pull in your favorite front-end libraries to help build your perfect Prodigy interface. You can also mount local CSS or JS directories to the Prodigy FastAPI web app and dynamically load your custom styles and JavaScript without having to inline JavaScript/CSS into Python strings inside your recipe file.

If you want you can even add a full-fledged framework like HTMX to your custom recipe and create a fully dynamic interface with Custom Events.

recipe.py
from pathlib import Path

import prodigy
from prodigy.components.stream import get_stream


@prodigy.recipe("my_recipe")
def my_recipe(dataset: str, source: str):

    stream = get_stream(source)

    return {
        "view_id": "html",
        "dataset": dataset,
        "stream": stream,
        "config": {
            "html_template": "my_custom_template.html",
            # Load remote JavaScript URLs in order, `global_css_urls` works the same way
            "javascript_urls": [
                "https://unpkg.com/my_custom_js"
            ],
            # Mount all CSS files in a local directory, `javascript_dir` works the same way
            "global_css_dir": Path(__file__).parent / "static_dir/css"
        }
    }

Custom Events (New: 1.14.1)

Prodigy has always emphasized being a scriptable, customizable tool. Custom Events allow even more customizability through interactivity. Custom events are registered functions that can be returned from a recipe and called via the Prodigy frontend.

As a simple example, let’s consider the Customer Feedback Sentiment Example in the previous section on Custom Recipes. You have examples of customer feedback and support emails to label with sentiment categories of “happy”, “sad”, “angry” or “neutral”.

Let’s expand on this scenario and imagine we have this Prodigy annotation session deployed on a secured cloud server with limited access rights. We have a scheduled job that saves new customer feedback each day to our cloud server but annotators don’t have access to the server to restart Prodigy. Using event hooks, we can implement a simple button to refresh the model from the Prodigy UI.

To start, we’ll launch our annotation server with the following data file.

feedback.jsonl
{"text": "Thanks for your great work – really made my day!"}
{"text": "Worst experience ever, never ordering from here again."}

And a slightly modified version of the recipe from the Custom Recipes section.

recipe.py
from pathlib import Path

import prodigy
from prodigy.core import Controller, ControllerComponentsDict
from prodigy.components.stream import Stream, get_stream
from spacy.language import Language  # used in the add_predictions type hint

from .model import load_model


@prodigy.recipe(
    "sentiment-refresh",
    dataset=("The dataset to save to", "positional", None, str),
    spacy_model=("Model to load", "positional", None, str),
    source=("Path to texts", "positional", None, Path),
)
def sentiment(dataset: str, spacy_model: str, source: Path) -> ControllerComponentsDict:
    """Annotate the sentiment of texts using different mood options."""
    nlp = load_model(spacy_model)            # load in the mock TextCat model
    stream = get_stream(source, rehash=True) # load in the JSONL file
    stream.apply(add_options)               # add options to each task
    stream.apply(add_predictions, nlp=nlp)  # add predictions to each task

    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "choice"
    }


def add_options(stream):
    # Helper function to add options to every task in a stream
    options = [
        {"id": "happy", "text": "😀 happy"},
        {"id": "sad", "text": "😢 sad"},
        {"id": "angry", "text": "😠 angry"},
        {"id": "neutral", "text": "😶 neutral"},
    ]
    for task in stream:
        task["options"] = options
        yield task


def add_predictions(stream, nlp: Language):
    tuples = ((eg["text"], eg) for eg in stream)
    for doc, task in nlp.pipe(tuples, as_tuples=True):
        label = list(doc.cats.keys())[0]
        task["accept"] = [label]
        yield task

For this example, we want to guarantee that changing models changes our predictions so we know it’s working. Let’s note the load_model function above. We’ll define this in a separate file and call it model.py.

The following code will handle loading 2 mock spaCy models that each have a “textcat” component. One always assigns the “happy” label, the other always assigns the “sad” label. In reality these could just be 2 separate models on disk, or 2 packaged spaCy models, and you could adjust the load_model util accordingly.

model.py
import spacy
from spacy.language import Language

# Mock spaCy Model utils, the same event logic can be
# applied to load models from disk
# --------------------------
@Language.component("textcat_mocker_happy")
def textcat_mocker_happy(doc):
    doc.cats = {"happy": 0.9}
    return doc


@Language.component("textcat_mocker_sad")
def textcat_mocker_sad(doc):
    doc.cats = {"sad": 0.9}
    return doc


def load_model(model_name: str):
    # Load the mock model, you'd want to take
    # a model_path and do a standard `spacy.load()` instead
    nlp = spacy.blank("en")
    nlp.add_pipe(model_name)
    return nlp

This will set up a standard multi-choice annotation interface but with one of the values pre-selected based on the currently selected model. We’re using mock textcat models for simplicity but you could easily use 2 or more packaged spaCy models or models saved to disk with the same controls.
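
To see what a task looks like after both helpers have run, the sketch below replays the same two steps without spaCy or Prodigy installed. `MockNLP` is a hypothetical stand-in that only imitates `nlp.pipe(..., as_tuples=True)` and always predicts one label; the helper functions are trimmed copies of the ones in the recipe, applied to a plain list instead of a Prodigy stream.

```python
from types import SimpleNamespace


class MockNLP:
    """Stand-in for a spaCy pipeline that always predicts a single label."""

    def __init__(self, label: str):
        self.label = label

    def pipe(self, tuples, as_tuples=True):
        # Only the as_tuples=True form used by the recipe is imitated here
        for text, context in tuples:
            doc = SimpleNamespace(text=text, cats={self.label: 0.9})
            yield doc, context


def add_options(stream):
    options = [{"id": "happy", "text": "happy"}, {"id": "sad", "text": "sad"}]
    for task in stream:
        task["options"] = options
        yield task


def add_predictions(stream, nlp):
    tuples = ((eg["text"], eg) for eg in stream)
    for doc, task in nlp.pipe(tuples, as_tuples=True):
        task["accept"] = [list(doc.cats.keys())[0]]
        yield task


stream = [{"text": "Worst experience ever, never ordering from here again."}]
tasks = list(add_predictions(add_options(stream), MockNLP("sad")))
print(tasks[0]["accept"])  # ['sad']
```

The "accept" list holds the IDs of the pre-selected options, so swapping the model changes which option is ticked when the task is shown.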

Now we’ll add a Recipe Event Hook to change the selected model, and rerun the model over our stream of data.

recipe.py
from pathlib import Path

import prodigy
from prodigy.core import Controller, ControllerComponentsDict
from prodigy.components.stream import Stream, get_stream
from spacy.language import Language  # used in the add_predictions type hint

from .model import load_model


@prodigy.recipe(
    "sentiment-refresh",
    dataset=("The dataset to save to", "positional", None, str),
    spacy_model=("Model to load", "positional", None, str),
    source=("Path to texts", "positional", None, Path),
)
def sentiment(dataset: str, spacy_model: str, source: Path) -> ControllerComponentsDict:
    """Annotate the sentiment of texts using different mood options."""
    nlp = load_model(spacy_model)            # load in the mock TextCat model
    stream = get_stream(source, rehash=True) # load in the JSONL file
    stream.apply(add_options)                # add options to each task
    stream.apply(add_predictions, nlp=nlp)   # add predictions to each task

    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "choice",
        "event_hooks": {
            # event_hooks is a Dict mapping the registered event name,
            # to a valid event handler.
            "refresh_model": refresh_model_event_handler
        }
    }


def refresh_model_event_handler(
    controller: Controller, *, example: dict, new_model_name: str
) -> dict:
    """A Recipe Event Hook Handler accepts the Prodigy Controller as
    the first positional argument and then any number of keyword arguments.
    In this case we'll pass the current example and return an updated
    example which we'll use to update the Prodigy UI in realtime.
    """
    new_nlp = load_model(new_model_name)

    # Remove the old add_predictions wrapper (the last one applied to the stream)
    controller.stream._wrappers.pop()
    # Update the stream with predictions for the new model
    controller.stream = controller.stream.apply(add_predictions, nlp=new_nlp)

    # Update the current example reactively so the page doesn't need to be reloaded
    updated_example = add_predictions(iter([example]), nlp=new_nlp)
    return next(updated_example)


def add_options(stream):
    # Helper function to add options to every task in a stream
    options = [
        {"id": "happy", "text": "😀 happy"},
        {"id": "sad", "text": "😢 sad"},
        {"id": "angry", "text": "😠 angry"},
        {"id": "neutral", "text": "😶 neutral"},
    ]
    for task in stream:
        task["options"] = options
        yield task


def add_predictions(stream, nlp: Language):
    tuples = ((eg["text"], eg) for eg in stream)
    for doc, task in nlp.pipe(tuples, as_tuples=True):
        label = list(doc.cats.keys())[0]
        task["accept"] = [label]
        yield task

Finally, in order to actually call our event handler we need to add a small amount of custom JavaScript to this recipe. We’ll also add a custom HTML template with a single input where you can enter the new name of the model to use.

Custom HTML

The HTML template looks like this:

<link
  rel="stylesheet"
  href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.1.2/css/all.min.css"
  integrity="sha512-1sCRPdkRXhBV2PBLUdRb4tMg1w2YPf37qatUFeS7zlBy7jJI8Lf4VHwWfZZfpXtYSLy85pkm9GaYVYMfw5BC1A=="
  crossorigin="anonymous"
  referrerpolicy="no-referrer"
/>
<button id="refreshButton" onclick="refreshData()">
  Refresh Data Stream
  <i
    id="loadingIcon"
    class="fa-solid fa-spinner fa-spin"
    style="display: none;"
  ></i>
</button>

Custom JavaScript

The following JavaScript code adds the refreshData function we reference in the onclick attribute of our new button in the HTML template above. When the button is clicked, we use the window.prodigy.event function to call into the event handler defined in our recipe under the refresh_model event name.

function refreshData() {
  document.querySelector('#loadingIcon').style.display = 'inline-block'
  // Must match a component name registered in model.py
  const event_data = {
    new_model_name: 'textcat_mocker_sad',
    example: window.prodigy.content,
  }
  window.prodigy
    .event('refresh_model', event_data)
    .then(updated_example => {
      console.log('Updating Current Example with new data:', updated_example)
      window.prodigy.update(updated_example)
      document.querySelector('#loadingIcon').style.display = 'none'
    })
    .catch(err => {
      console.error('Error in Event Handler:', err)
    })
}

And we’ll update the Recipe to use this custom JavaScript.

...

CWD = Path(__file__).parent

@prodigy.recipe(
    "sentiment-refresh",
    dataset=("The dataset to save to", "positional", None, str),
    spacy_model=("Model to load", "positional", None, str),
    source=("Path to texts", "positional", None, Path),
)
def sentiment(dataset: str, spacy_model: str, source: Path) -> ControllerComponentsDict:
    """Annotate the sentiment of texts using different mood options."""
    nlp = load_model(spacy_model)            # load in the mock TextCat model
    stream = get_stream(source, rehash=True) # load in the JSONL file
    stream.apply(add_options)                # add options to each task
    stream.apply(add_predictions, nlp=nlp)   # add predictions to each task

    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "blocks",
        "config": {
            "blocks": [
                {"view_id": "choice"},
                # Load custom Input for New Model Name
                {"view_id": "html", "html_template": (CWD / "sentiment-refresh.html").read_text()}
            ],
            # Load custom JavaScript file.
            "javascript": (CWD / "sentiment-refresh.js").read_text(),
        },
        "event_hooks": {
            # add event handler endpoint at: `/event/refresh_model`
            "refresh_model": refresh_model_event_handler
        }
    }
...

Note that window.prodigy.event accepts an event name as the first argument, and an Object with keys that exactly match the expected keyword-only arguments to the Event Hook’s handler function we defined.

Event Calling

const event_data = {
  new_model_name: 'textcat_mocker_sad',
  example: window.prodigy.content,
}
window.prodigy.event('refresh_model', event_data)

Event Handler

    ...
        "event_hooks": {
            # event_hooks is a Dict mapping the registered event name,
            # to a valid event handler.
            "refresh_model": refresh_model_event_handler
        }
    }


def refresh_model_event_handler(
    controller: Controller, *, example: dict, new_model_name: str
) -> dict:
    ...
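
The requirement that the payload keys match the handler's keyword-only arguments can be tried out in plain Python. The sketch below assumes the payload is unpacked into keyword arguments, which is what the matching rule suggests; `call_event` is a hypothetical dispatcher written for illustration, and the handler is a simplified stand-in that does not load a model.

```python
def refresh_model_event_handler(controller, *, example: dict, new_model_name: str) -> dict:
    """Simplified stand-in for the handler: records the requested model name."""
    return {**example, "model": new_model_name}


def call_event(handler, payload: dict) -> dict:
    """Hypothetical dispatcher: unpack the payload into keyword arguments."""
    controller = None  # placeholder, the real Controller is supplied by Prodigy
    return handler(controller, **payload)


good = {"example": {"text": "hello"}, "new_model_name": "textcat_mocker_sad"}
print(call_event(refresh_model_event_handler, good))

# A misspelled key does not match any keyword-only argument and is rejected
bad = {"example": {"text": "hello"}, "model_name": "textcat_mocker_sad"}
try:
    call_event(refresh_model_event_handler, bad)
except TypeError as err:
    print("Rejected payload:", err)
```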

If we run this example now, and click the Refresh Data Stream button, we’ll see our predictions update in real time.

Example: Checking a single example against GPT-4

In this example, we use a slightly extended version of the built-in Prodigy recipe ner.llm.correct. The recipe uses a spacy-llm config to do zero-shot NER prediction for text related to food. We’re looking to extract the following set of labels.

  • DISH: The names of food dishes, e.g. Lobster Ravioli, garlic bread
  • INGREDIENT: Individual parts of a food dish, including herbs and spices
  • EQUIPMENT: Any kind of cooking equipment, e.g. oven, cooking pot, grill

To conserve our limited project funding, we’re using the older, cheaper text-davinci-002 model from OpenAI in our config.

This will get us a good start on a lot of our annotations, but as an example of the kind of interactivity you can build, we’ll develop an interface and a custom event that allows us to consult a newer, more capable model (e.g. GPT-3.5 or GPT-4) for a single annotation. For instance, we might want to check with GPT-4 on a really difficult example.

For this project we’ll utilize the blocks feature to develop a custom interface with a button we can press to send a custom event and check what GPT-4 has to say about the current example.

The interface we’ll be building looks like this.

[Screenshot: custom event LLM model check]

If we press this button, we’ll be calling a custom event that will in turn call GPT-4 and update our example in place with the NER predictions it sends back.

For the full project, check out the Prodigy Recipes Repo.