TextCollectionAttachment

An array of TextCollectionAttachment objects to be labeled.

Video Support

The video attachment should have content that is a link. Supported media types are listed on the MDN Web Docs.

HTML Support in TextCollection Attachments:

When creating a task in TextCollection, customers are able to pass Markdown as the string content. Markdown also allows the use of HTML tags within the Markdown syntax.

However, to ensure the security of the TextCollection platform, we sanitize all HTML tags passed within the Markdown syntax using the HTML-sanitize JavaScript package. This package removes all tags except for the specific set of allowed HTML tags listed in the "HTML tags allowed" table below.

Any HTML tag that is not in the list of allowed tags is removed from the string during sanitization. Restricting content to this list keeps what is displayed to the tasker secure and prevents the security risks that unauthorized HTML tags could introduce.

Parameter	Type	Description
type*	string	One of `pdf`, `image`, `text`, `video`, `website`, or `audio`.
content*	string	Content or link to relevant file.
forms	array	Array of `field_id` strings from `FormField`. If this value is set, only show the corresponding attachment if one of the referenced form fields is active.

HTML tags allowed:

Content sectioning	'address', 'article', 'aside', 'footer', 'header', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hgroup', 'main', 'nav', 'section'
Text content	'blockquote', 'dd', 'div', 'dl', 'dt', 'figcaption', 'figure', 'hr', 'li', 'main', 'ol', 'p', 'pre', 'ul'
Inline text semantics	'a', 'abbr', 'b', 'bdi', 'bdo', 'br', 'cite', 'code', 'data', 'dfn', 'em', 'i', 'kbd', 'mark', 'q', 'rb', 'rp', 'rt', 'rtc', 'ruby', 's', 'samp', 'small', 'span', 'strong', 'sub', 'sup', 'time', 'u', 'var'
Table content	'caption', 'col', 'colgroup', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr'
Additional Tags	'img', 'iframe'
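
As a rough illustration of how the allow-list behaves, the Python sketch below reports which tags in a Markdown/HTML string fall outside the table above and would therefore be stripped. It relies only on the standard library `html.parser` module. The names `ALLOWED_TAGS` and `find_disallowed_tags` are hypothetical, and this is not the sanitizer the platform itself runs.

```python
from html.parser import HTMLParser

# Allow-list copied from the "HTML tags allowed" table.
ALLOWED_TAGS = {
    "address", "article", "aside", "footer", "header", "h1", "h2", "h3",
    "h4", "h5", "h6", "hgroup", "main", "nav", "section",
    "blockquote", "dd", "div", "dl", "dt", "figcaption", "figure", "hr",
    "li", "ol", "p", "pre", "ul",
    "a", "abbr", "b", "bdi", "bdo", "br", "cite", "code", "data", "dfn",
    "em", "i", "kbd", "mark", "q", "rb", "rp", "rt", "rtc", "ruby", "s",
    "samp", "small", "span", "strong", "sub", "sup", "time", "u", "var",
    "caption", "col", "colgroup", "table", "tbody", "td", "tfoot", "th",
    "thead", "tr",
    "img", "iframe",
}


class _TagCollector(HTMLParser):
    """Collects the names of opening tags that are not on the allow-list."""

    def __init__(self):
        super().__init__()
        self.disallowed = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag names before calling this hook.
        if tag not in ALLOWED_TAGS and tag not in self.disallowed:
            self.disallowed.append(tag)


def find_disallowed_tags(content):
    """Return the tags in `content` that are absent from ALLOWED_TAGS."""
    collector = _TagCollector()
    collector.feed(content)
    collector.close()
    return collector.disallowed
```

Running a check like this before creating a task can flag content that will look different to taskers than intended.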

UnitField

UnitField objects define simple components for data collection.

Beta: Conditional Fields

Sometimes a field should only be presented if specific choices are selected for other fields. In these cases, you can specify the conditions — the dependent questions and corresponding sets of choices.

The conditions property should have the following structure: an array of objects, each of which defines one set of conditions that allows the field to be shown. The operators AND ({ }), OR ([ ]), and NOT (not) are supported, so you can specify an arbitrary combination of fields and choices. Each set may contain objects or arrays with the following:

Key: the field_id of the dependent field
Value: an object specifying the desired choices for the dependent field.

For example conditions, see the UnitField example below.

Conditions currently only work with dependent fields of type CategoryField. The syntax is accepted on other field types, but it may raise errors or result in undefined behavior.

Parameter	Type	Default	Description
type*	string		One of `text`, `boolean`, `number`, `datetime`, `category`, `select`, or `time_range`.
field_id*	string		A unique identifier for the field, which should not change among tasks within a project.
title*	string		Field title to be displayed to taskers. This should be short and singular. This may change among tasks within a project. Must not be an empty string.
description	string	undefined	A brief description about what the response should be. This may change among tasks within a project.
hint	string	undefined	Longer explanation of why the field exists and how it should be used. Renders as a tooltip.
required	boolean	false	Determines whether or not a response for this field is required.
min_responses_required	integer	1	The minimum number of separate annotations allowed for this field. Must be larger than 0.
max_responses_required	integer	1	The maximum number of separate annotations allowed for this field. Must be larger than or equal to `min_responses_required`, with an upper bound of 100.
conditions	array_object	undefined	A set of conditions which must be satisfied for this field to be shown.
Additional Fields			See the TextField, BooleanField, NumberField, DatetimeField, and CategoryField sections.
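
The constraints in the table above can be checked before a task is submitted. The following Python sketch is one possible way to do that. The function name `validate_unit_field` and the error messages are hypothetical, and the checks only mirror the table (required keys, non-empty title, and the bounds on `min_responses_required` and `max_responses_required`), not any server-side validation.

```python
VALID_TYPES = {"text", "boolean", "number", "datetime", "category", "select", "time_range"}


def validate_unit_field(field):
    """Return a list of problems found in a UnitField-like dict (empty if none)."""
    errors = []
    if field.get("type") not in VALID_TYPES:
        errors.append("type is missing or not a supported value")
    if not isinstance(field.get("field_id"), str):
        errors.append("field_id is required and must be a string")
    title = field.get("title")
    if not isinstance(title, str) or title == "":
        errors.append("title is required and must not be empty")
    # Both bounds default to 1 according to the table.
    low = field.get("min_responses_required", 1)
    high = field.get("max_responses_required", 1)
    if low <= 0:
        errors.append("min_responses_required must be larger than 0")
    if high < low or high > 100:
        errors.append("max_responses_required must be >= min_responses_required and <= 100")
    return errors
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in a field definition at once.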

Example

// Example of UnitField with conditions
{
  type: "category",
  field_id: "occlusion",
  title: "Is there occlusion in the image?",
  choices: [{label: 'None', value: '0' },
            {label: 'A little', value: '1'},
            {label: 'A lot', value: '2'}],
  conditions: [{}],
},
{
  type: "category",
  field_id: "occlusion_detail",
  title: "What is the cause of the occlusion?",
  choices: [{label: 'Rain', value: 'rain'},
            {label: 'Shadow', value: 'shadow'}],
  conditions: [{
    occlusion: ['1', '2'], // show if 1 or 2 are selected
    // equivalently {not: [[], ['0']]}
    // equivalently [{not: []}, {not: ['0']}]
    // equivalently [['1'],['2']]
  }],
},
{
  type: "text",
  field_id: "a_lot_of_shadow",
  title: "Please describe why there is so much shadow.",
  conditions: [{
    // show if 2 and shadow are selected in their respective fields
    occlusion: ['2'], 
    occlusion_detail: ['shadow'],
  }],
},
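
To make the semantics of the example concrete, here is a minimal Python sketch that decides whether a field would be shown. It assumes the simplest reading of the rules: the outer array is an OR over condition sets, each object is an AND over its keys, and a list of choices matches when any listed choice is selected. The `not` operator and nested arrays are deliberately left out, and the function name `is_field_visible` is hypothetical.

```python
def is_field_visible(conditions, selected):
    """Evaluate simple conditions against the currently selected choices.

    conditions: list of dicts mapping field_id -> list of accepted values.
    selected:   dict mapping field_id -> list of values the tasker selected.
    """
    if not conditions:
        return True  # no conditions at all: always shown
    for condition_set in conditions:
        # Every dependent field in the set must have at least one accepted value.
        if all(
            set(selected.get(field_id, [])) & set(accepted)
            for field_id, accepted in condition_set.items()
        ):
            return True
    return False
```

With the fields above, `is_field_visible([{"occlusion": ["1", "2"]}], {"occlusion": ["1"]})` is true, while selecting only `"0"` hides `occlusion_detail`.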

TextField

Subclass of UnitField and returns a string response.

Example

{
  "type": "text",
  "field_id": "summary",
  "title": "Summary",
  "min_responses_required": 1,
  "max_responses_required": 3,
  "max_characters": 500,
  "required": true
}

Parameter	Type	Default	Description
max_characters	integer	`undefined`	The maximum number of characters allowed in the field.

BooleanField

Subclass of UnitField and returns a boolean response. Has no additional parameters.

NumberField

Subclass of UnitField and returns a string response based on the annotated number.

Example

{
  "type": "number",
  "field_id": "item_price",
  "title": "Item Price",
  "description": "Leave empty if not applicable.",
  "required": false,
  "use_slider": true,
  "min": 0,
  "max": 100
}

Parameter	Type	Default	Description
use_slider	boolean	`false`	Set to `true` to use a slider instead of textbox.
min	float	`undefined`	Sets the minimum value of the slider.
max	float	`undefined`	Sets the maximum value of the slider.
step	float	`undefined`	Sets the step value of the slider.

DatetimeField

Subclass of UnitField and returns a DatetimeAnnotation response.

Definition: `DatetimeSpec`

An enum that consists of year, month, day, hour, and minute.

Definition: `DatetimeAnnotation`

An interface that contains optional number fields including year, month, day, hour, and minute.

Example

{
  "type": "datetime",
  "field_id": "release_date",
  "title": "Date of Product Release",
  "description": "Leave empty if not applicable.",
  "include": ["year", "month", "day"],
  "defaults": {
    "year": 2021,
    "month": 4,
    "day": 13
  }
}

Parameter	Type	Default	Description
include*	array		An array of `DatetimeSpec` elements. Must contain at least one element.
defaults	DatetimeAnnotation	`{}`	Default value for the return value.
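
A DatetimeField definition can be sanity-checked against the `DatetimeSpec` enum before use. The Python sketch below is an illustration only: the function name `check_datetime_field` is hypothetical, and rejecting unknown keys in `defaults` is an assumption based on the `DatetimeAnnotation` definition rather than a documented rule.

```python
# The five members of the DatetimeSpec enum described above.
DATETIME_SPECS = ("year", "month", "day", "hour", "minute")


def check_datetime_field(field):
    """Raise ValueError if `include` or `defaults` use unknown or missing specs."""
    include = field.get("include", [])
    if not include:
        raise ValueError("include must contain at least one element")
    unknown = [spec for spec in include if spec not in DATETIME_SPECS]
    if unknown:
        raise ValueError(f"unknown DatetimeSpec values: {unknown}")
    # Assumption: defaults may only use the keys of DatetimeAnnotation.
    bad_keys = [key for key in field.get("defaults", {}) if key not in DATETIME_SPECS]
    if bad_keys:
        raise ValueError(f"unknown keys in defaults: {bad_keys}")
```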

CategoryField

Subclass of UnitField and returns an array of selected CategoryChoiceValue elements in its response.

CategoryChoice elements with subchoices are only used for navigation. The only selectable CategoryChoice elements are those with no subchoices.

Example

{
  "type": "category",
  "field_id": "genre",
  "title": "Select all genres that apply.",
  "choices": [
    {
      "label": "Hip-Hop/Rap",
      "value": "hip-hop-rap",
      "hint":
        "It consists of a stylized rhythmic music that commonly accompanies rapping, a rhythmic and rhyming speech that is chanted.",
      "subchoices": [
        { "label": "Dirty South", "value": "dirty-south" },
        { "label": "Industrial Hip Hop", "value": "industrial-hip-hop" },
        { "label": "Nerdcore", "value": "nerdcore" },
        { "label": "Rap", "value": "rap" }
      ]
    },
    {
      "label": "R&B/Soul",
      "value": "rb-soul",
      "subchoices": [
        { "label": "Disco", "value": "disco" },
        { "label": "Funk", "value": "funk" },
        { "label": "Motown", "value": "motown" }
      ]
    }
  ],
  "min_choices": 1,
  "max_choices": 5
}

Parameter	Type	Default	Description
choices*	array		An array of `CategoryChoice` elements to define the relevant choice.
min_choices	integer	`1`	Minimum number of choices to select.
max_choices	integer	`1`	Maximum number of choices to select. If this value is greater than 1, the form renders a checkbox. Otherwise, it renders a radio button.

CategoryChoice

Parameter	Type	Default	Description
label*	string		The label of the choice field. This description may change among tasks within a project.
value*	CategoryChoiceValue		The value of the choice field. Must be a `string`, `number`, or `boolean`.
hint	string	undefined	Additional explanation of the choice, displayed to taskers.
subchoices	array	undefined	An array of `CategoryChoice` elements to define the relevant subchoices.
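
Because only `CategoryChoice` elements without subchoices are selectable, it can be handy to list the values a tasker can actually return. The recursive Python sketch below does this for a `choices` array. The function name `selectable_values` is hypothetical.

```python
def selectable_values(choices):
    """Return the values of all leaf choices, in document order."""
    values = []
    for choice in choices:
        subchoices = choice.get("subchoices")
        if subchoices:
            # A choice with subchoices is navigation only: descend into it.
            values.extend(selectable_values(subchoices))
        else:
            values.append(choice["value"])
    return values
```

For the genre example above, the result contains the seven leaf values such as `dirty-south` and `motown`, but neither `hip-hop-rap` nor `rb-soul`.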

TimerangeField

Subclass of UnitField.

Example

{
  "type": "time_range",
  "field_id": "hours",
  "title": "Store Hours",
  "defaults_seconds": [
    28800,
    72000
  ],
  "increment_seconds": 300,
  "max_responses_required": 2, 
  "min_responses_required": 0
}

Parameter	Type	Description
default_seconds*	array	Must have length 2, and be in range [0, 24 * 60 * 60]
increment_seconds	number	Must be between 1 and 60 * 60
default_from_field	string	Must be a valid field_id
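
The default range is expressed in seconds since midnight, which is not easy to read at a glance. The Python sketch below validates a two-element range against the limits in the table and renders it as `HH:MM` strings. The function name `format_time_range` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def format_time_range(seconds_pair):
    """Validate a [start, end] pair of seconds and return it as HH:MM strings."""
    if len(seconds_pair) != 2:
        raise ValueError("the range must have length 2")
    for seconds in seconds_pair:
        if not 0 <= seconds <= SECONDS_PER_DAY:
            raise ValueError("values must be in range [0, 24 * 60 * 60]")
    return tuple(
        f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}" for seconds in seconds_pair
    )
```

For the store-hours example, `format_time_range([28800, 72000])` gives `("08:00", "20:00")`.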

SelectField

Subclass of UnitField.

Example

{
  "type": "select",
  "field_id": "sentiment",
  "title": "Sentiment",
  "description": "Choose a sentiment that best describes this text",
  "required": true,
  "choices_from_field": "Options"
}

Parameter	Type	Default	Description
choices	array		An array of selectable options. Not required if `choices_from_field` is present.
choices_from_field	string		Must be a valid field_id

RankingField

RankingField objects allow you to define a task in which taskers rank the task attachments.

Returns a list response with ordered options.

Example

{
  "type": "ranking_order",
  "field_id": "relevance_ranking",
  "title": "Rank titles based on their relevance to the article",
  "hint": "From the most relevant to the least one",
  "first_label": "Best",
  "last_label": "Worst",
  "num_items_to_rank": 3
}

Parameter	Type	Default	Description
title	string	`undefined`	A brief description about what the response should be. This may change among tasks within a project.
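hint	string	`undefined`	Longer explanation of how the field should be used, for example the direction of the ranking.
first_label	string	`undefined`	Label shown at the first position of the ranking, for example `Best`.
last_label	string	`undefined`	Label shown at the last position of the ranking, for example `Worst`.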
num_items_to_rank	integer	`3`	The number of options required to rank (can be less than number of attachments).
required	boolean	`false`	Determines whether or not all `num_items_to_rank` fields should be filled.

FormField

FormField objects allow you to create several mini-forms associated with different attachments. These mini-forms will be populated by the object's child fields.

Returns a dictionary response with key-value pairs defined by its child fields.

📘Note
FormField objects can only be located on the top level of the fields task parameter. If one FormField object is used, all the other top-level objects must also be FormField objects.

Example

{
  "type": "form",
  "field_id": "form_query",
  "title": "Query Intention",
  "fields": [
    {
      "type": "text",
      "field_id": "query_intention",
      "title": "Query Intention",
      "hint": "Please investigate the search links."
    }
  ]
}

Parameter	Type	Default	Description
type*	string		For `FormField` Objects, this should be set to `form`
field_id*	string		A unique identifier for the field, which should not change among tasks within a project
title*	string		Field title to be displayed to taskers. This should be short and singular. This may change among tasks within a project.
description	string	`undefined`	A brief description about what the response should be. This may change among tasks within a project.
fields*	array		An array of child UnitField and FieldSet objects. Any FieldSet objects here must have inline set to true.
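
The placement rule from the note above (if one top-level object is a FormField, all of them must be) is easy to verify before creating a task. A minimal Python sketch, with the hypothetical function name `top_level_fields_are_consistent`:

```python
def top_level_fields_are_consistent(fields):
    """Return True if the top-level fields contain no FormField or only FormFields."""
    form_count = sum(1 for field in fields if field.get("type") == "form")
    return form_count == 0 or form_count == len(fields)
```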

Text Collection Callback Format

The response object, which is part of the callback POST request and permanently stored as part of the task object, will have an annotations field. The annotations object is a dictionary in which each key is a field_id defined in the task parameters and each value is the respective annotation for that field.

Each annotation will be of the type defined by its field above. If max_responses_required is applicable and greater than 1, the annotation will be an array of the type.

📘
See the Callback section for more details about callbacks.

Example

{
  "response": {
    "annotations": {
      "category_name": "Soup", //TextField
      "category_items": [ //FieldSet with max_responses_required greater than one
        {
          "item_name": "Tom Yum Chicken Soup", //TextField
          "item_price": "11.79" //NumberField
        },
        {
          "item_name": "Tom Yum Beef Soup", //TextField
          "item_price": "11.79" //NumberField
        }
      ],
      "category_metadata": { //FieldSet
        "gluten_friendly": true, //BooleanField
        "labels": [ //TextField with max_responses_required greater than one
          "Free Range", 
          "All Natural"
        ] 
      }
    }
  },
  "task_id": "5774cc78b01249ab09f089dd",
  "task": {
    // populated task for convenience
  }
}
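
A callback consumer typically parses the POST body and reads `response.annotations`. The Python sketch below shows one way this could look for the example above. The function names are hypothetical, the field ids `category_items` and `item_price` come from the example only, and the body is assumed to be plain JSON (the `//` comments in the example are explanatory and would not be present). Because a NumberField returns its value as a string, the prices are converted with `Decimal`.

```python
import json
from decimal import Decimal


def extract_annotations(callback_body):
    """Parse a callback body and return (task_id, annotations)."""
    payload = json.loads(callback_body)
    return payload["task_id"], payload["response"]["annotations"]


def item_prices(annotations):
    """Convert the string-valued NumberField responses of each item to Decimal."""
    return [Decimal(item["item_price"]) for item in annotations["category_items"]]
```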

Text Collection Hypothesis

When creating a textcollection task, you can provide pre-labels in the hypothesis field so that workers don't have to annotate from scratch.

In order to add pre-labels in a task using hypothesis, you’ll need to provide these in the hypothesis field of the payload when creating the task. The schema of the hypothesis object must match the schema of the task response.

Verify the task response field schema for the desired task type.
Review your project taxonomy (label names, attribute conditions, annotation types, etc).
Generate pre-labels that are formatted to match the aforementioned schema and taxonomy.
Create a task, including a hypothesis field that contains the pre-labels at the same top-level as other task fields such as project and instructions.

The hypothesis format will largely mirror Scale’s task response format. In this particular task type, the annotations field is mandatory inside the hypothesis object.

The only difference between the hypothesis and the response format is that inside every field you want to pre-annotate, you'll need to add two more fields:

type describes the field type (category, select, text, etc.)
field_id describes the identification given to this field for tracking (field name)

You can find these two fields in your task taxonomy.

Note: For text-type fields, the response format differs from the other types. For this field type, the response field is an array containing a single string instead of an array of arrays containing strings.

task_payload_with_hypothesis

{
 ...
 "batch": "regular_batch_name",
 "hypothesis": {
   "annotations": {
     "(EXAMPLE) Multiple Choice Question": {
       "type": "category",
       "field_id": "(EXAMPLE) Multiple Choice Question",
       "response": [
         [
           "B"
         ]
       ]
     }
   }
 },
 ...
}

task_taxonomy

{
   "fields": [
     {
       "type": "category",
       "field_id": "(EXAMPLE) Multiple Choice Question",
       "title": "Which option best fits this task?",
       "choices": [
         {
           "label": "A",
           "value": "A"
         },
         {
           "label": "B",
           "value": "B"
         },
         {
           "label": "C",
           "value": "C"
         }
       ],
       "min_choices": 1,
       "max_choices": 1,
       "description": "Select one of the following. "
     }
   ]
 }

task_payload_with_hypothesis_text_field

{
   ...
   "hypothesis": {
       "annotations": {
           "(EXAMPLE) Text Input Field": {
               "type": "text",
               "field_id": "(EXAMPLE) Text Input Field",
               "response": [
                   "Dolore in dolor occaecat deserunt ex in qui non amet est."
               ]
           }
       }
   }
   ...
}
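
The payloads above can be generated from the task taxonomy rather than written by hand. The Python sketch below is one possible approach under a few assumptions: the function name `build_hypothesis` is hypothetical, pre-labels are supplied as a dict keyed by `field_id`, text fields take a single string, and every other field type takes a list of values that is wrapped in an outer array as in the category example.

```python
def build_hypothesis(fields, prelabels):
    """Build a hypothesis object from taxonomy fields and pre-label values."""
    annotations = {}
    for field in fields:
        field_id = field["field_id"]
        if field_id not in prelabels:
            continue  # no pre-label: leave this field for taskers to fill in
        value = prelabels[field_id]
        if field["type"] == "text":
            response = [value]  # array of a single string
        else:
            response = [list(value)]  # array of arrays, as in the category example
        annotations[field_id] = {
            "type": field["type"],
            "field_id": field_id,
            "response": response,
        }
    return {"annotations": annotations}
```

The returned object would go in the `hypothesis` field of the task payload, at the same level as `project` and `instructions`.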

NamedEntityRecognitionLabel

NamedEntityRecognitionLabel objects define the taxonomy of labels to use to annotate spans of text.

NamedEntityRecognitionAttribute objects define form fields for individual annotations.

AttributeSelectOption objects define possible values for select attributes.

NamedEntityRecognitionLabel

Parameter	Type	Default	Description
name*	string		A unique identifier for this label.
display_name	string	`name`	An alias for this label to display to taskers.
description	string	`undefined`	A description of what this label should represent. Displayed to taskers to improve quality.
children	array_object	`undefined`	An array of `NamedEntityRecognitionLabel` objects to group underneath this label. Specifying this field causes this label itself to no longer be used for labeling text spans.
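attributes (optional)	object	`undefined`	An object whose keys name the attributes of this label and whose values are `NamedEntityRecognitionAttribute` objects.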

NamedEntityRecognitionAttribute

Parameter	Type	Description
type	string	Only 'select' for now.
options	array_object	List of select option objects.
display_name	string	Optional display name.
description	string	Optional description.

AttributeSelectOption

Parameter	Type	Description
value	string	The value that will show up in the response if this option is selected.
display_name	string	Optional display name if different from the value.

NamedEntityRecognitionRelationshipDefinition

NamedEntityRecognitionRelationshipDefinition objects specify the types of relationship that can exist between two text spans.

A relationship can either be named or unnamed. A named relationship is useful if you need to distinguish between multiple types of relationship that could exist between the same two text spans. For instance, if you're annotating a description of someone's family history, you might want to distinguish a "child of" relationship from a "sibling of" relationship.

A task can only specify one type of relationship. Either all the relationships in a task must be named, or all must be unnamed.

Parameter	Type	Default	Description
name	string		A unique identifier for this type of relationship. Required for named relationships; disallowed for unnamed relationships.
display_name	string		A description for this relationship to display to taskers. Should be able to be used to construct a short phrase describing the relationship. For example, a relationship between two text spans "A" and "B" with `display_name` "is parent of" would be rendered to taskers as "A is parent of B". Required for named relationships; disallowed for unnamed relationships.
is_directed	boolean	false	A field indicating whether the directionality of this relationship matters. For example, a "is parent of" relationship would likely be directed, whereas a "is sibling of" relationship would likely not be directed. Optional for named relationships; disallowed for unnamed relationships.
source_label	string		A string referencing the `name` field of a `NamedEntityRecognitionLabel` object. If set, mandates that the source text span of this field must be labeled with the corresponding `NamedEntityRecognitionLabel`, or one of its `children`. Optional for both named and unnamed relationships.
target_label	string		A string referencing the `name` field of a `NamedEntityRecognitionLabel` object. If set, mandates that the target text span of this field must be labeled with the corresponding `NamedEntityRecognitionLabel`, or one of its `children`. Optional for both named and unnamed relationships.
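
The named/unnamed rules in the table can be checked mechanically. The Python sketch below takes a list of relationship definition dicts and raises on the violations described above: mixing named and unnamed definitions, a named definition without `display_name`, or an unnamed definition that sets `display_name` or `is_directed`. The function name `check_relationship_definitions` is hypothetical.

```python
def check_relationship_definitions(definitions):
    """Raise ValueError if the definitions break the named/unnamed rules."""
    has_name = ["name" in definition for definition in definitions]
    if any(has_name) and not all(has_name):
        raise ValueError("all relationships must be named, or all must be unnamed")
    for definition in definitions:
        if "name" in definition:
            if "display_name" not in definition:
                raise ValueError("display_name is required for named relationships")
        else:
            for key in ("display_name", "is_directed"):
                if key in definition:
                    raise ValueError(f"{key} is disallowed for unnamed relationships")
```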

Named Entity Recognition Callback Format

The response object is part of the callback POST request and is permanently stored as part of the task object.

NamedEntityRecognitionResponse

The structure of a response object for named entity recognition consists of two arrays: one for entity annotations and another for relationships between these entities.

NamedEntityRecognitionAnnotation

Each individual entity annotation within the named entity recognition response details the unique identifier, position, and content of the recognized text span, as well as its label and any optional attributes.

NamedEntityRecognitionRelationship

In tasks with undirected relationships, the source_ref and target_ref fields are interchangeable. In tasks with links that do not have relationship names, the name field will be left blank.

Example

{
  "annotations": [
    {
      "id": "b86c22a3-1f7c-4be2-bb8f-899ee9324c0b",
      "start": 10,
      "end": 17,
      "text": "Alex Wang",
      "label": "person"
    },
    {
      "id": "a76da53e-4ebd-4466-aed7-80db6fb98329",
      "start": 22,
      "end": 31,
      "text": "Transform",
      "label": "conference"
    }
  ],
  "relationships": [
    {
      "id": "ade8e9e9-ef9c-4fc7-9517-62d79a15c1cb",
      "source_ref": "b86c22a3-1f7c-4be2-bb8f-899ee9324c0b",
      "target_ref": "a76da53e-4ebd-4466-aed7-80db6fb98329",
      "name": "speaker_at"
    }
  ]
}

NamedEntityRecognitionResponse

Field	Type	Description
annotations	object array	List of `NamedEntityRecognitionAnnotation` objects.
relationships	object array	List of `NamedEntityRecognitionRelationship` objects.

NamedEntityRecognitionAnnotation

Field	Type	Description
id	string	Unique identifier.
start	number	Start index of the text span.
end	number	End index of the text span.
text	string	Text of the text span.
label	string	References the `name` field of a label in the task params.
attributes (optional)	object	The keys of the object reference keys of the `attributes` object for the corresponding label in the task params.

NamedEntityRecognitionRelationship

Field	Type	Description
id	string	Unique identifier.
source_ref	string	References the `id` of the annotation that is the source of the directed relationship.
target_ref	string	References the `id` of the annotation that is the target of the directed relationship.
name (optional)	string	References the `name` of relationship definitions in the task params.
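
Since relationships refer to annotations by `id`, a consumer usually has to join the two arrays. The Python sketch below resolves each relationship to the text of its source and target spans. The function name `describe_relationships` is hypothetical, and a missing `name` is represented as an empty string.

```python
def describe_relationships(response):
    """Return (source text, relationship name, target text) for each relationship."""
    by_id = {annotation["id"]: annotation for annotation in response["annotations"]}
    described = []
    for relationship in response.get("relationships", []):
        source = by_id[relationship["source_ref"]]
        target = by_id[relationship["target_ref"]]
        described.append((source["text"], relationship.get("name", ""), target["text"]))
    return described
```

Applied to the example response, this yields a single tuple linking "Alex Wang" to "Transform" through `speaker_at`.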
