concepts.benchmark.clevr.dataset.CLEVRCustomTransferDataset#

class CLEVRCustomTransferDataset[source]#

Bases: FilterableDatasetUnwrapped

The unwrapped CLEVR dataset for custom transfer learning.

Methods

`get_metainfo`(index)
`make_dataloader`(batch_size, shuffle, ...)	Make a dataloader for this dataset view.

__add__(other)#

__getitem__(index)[source]#

Get a sample from the dataset.

Returns:

a dict of annotations, including:

scene: the scene annotations (raw dict).
objects: the bounding boxes of the objects (a Tensor of shape [N, 4]).
image_index: the index of the image (int).
image_filename: the filename of the image (str).
image: the image (a Tensor of shape [3, H, W]).
question_index: the index of the question (int).
question_raw: the raw question (str).
question_type: the type of the question (str).
answer: the answer to the question (bool, int, or str).
attribute_{attr_name}: the attribute concept id for each object (a Tensor of shape [N]).
attribute_relation_{attr_name}: the attribute relation concept id for each pair of objects (a Tensor of shape [N, N], then flattened to [N * N]).
relation_{attr_name}: the relational concept id for each pair of objects (a Tensor of shape [N, N, NR], then flattened to [N * N * NR]).

Return type:

dict

Parameters:

index (int)
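The pairwise fields returned by `__getitem__` are flattened from [N, N] to [N * N], so reading the entry for a specific object pair requires an index calculation. The sketch below assumes a row-major (C-order) flattening, which the documentation does not state explicitly; `pair_index` and `unflatten_pairs` are hypothetical helper names, not part of the library, and they operate on plain Python sequences.

```python
def pair_index(i, j, n):
    """Return the flat position of the ordered pair (i, j) among n objects.

    Assumes the [n, n] grid was flattened in row-major order.
    """
    if not (0 <= i < n and 0 <= j < n):
        raise IndexError(f"pair ({i}, {j}) is out of range for n={n}")
    return i * n + j


def unflatten_pairs(flat, n):
    """Rebuild an n-by-n nested list from a flat sequence of length n * n."""
    if len(flat) != n * n:
        raise ValueError(f"expected {n * n} entries, got {len(flat)}")
    return [list(flat[row * n:(row + 1) * n]) for row in range(n)]


# Toy stand-in for a flattened attribute_relation_* field with N = 3.
flat_relation = list(range(9))
grid = unflatten_pairs(flat_relation, 3)

# Both access paths reach the same entry for the pair (1, 2).
same_entry = flat_relation[pair_index(1, 2, 3)] == grid[1][2]
```

If the field is a torch Tensor, `flat.reshape(n, n)` should give the same layout under the same row-major assumption.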

__init__(scenes_json, questions_json, image_root, image_transform, query_list_key, custom_fields, output_vocab_json=None, incl_scene=True, incl_raw_scene=False)[source]#

Initialize the CLEVR custom transfer dataset.

Parameters:

scenes_json (str) – the path to the scenes json file.
questions_json (str) – the path to the questions json file.
image_root (str) – the root directory of the images.
image_transform (Callable) – the image transform (torchvision transform).
query_list_key (str) – the key of the query list in the questions json file (e.g., ‘questions’ or ‘questions_human’).
custom_fields (Sequence[str]) – the custom fields to be included in the dataset. These are fields in the scene annotations.
output_vocab_json (str | None) – the path to the output vocab json file. If None, the output vocab will be built from the dataset.
incl_scene (bool) – whether to include the scene annotations (e.g., objects, relationships, etc.).
incl_raw_scene (bool) – whether to include the raw scene annotations.
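A minimal sketch of how the constructor arguments above might be assembled. The directory layout, the file names, and the custom field name `my_custom_field` are all hypothetical placeholders; only the parameter names and the import path come from this page. The import is deferred into `build_dataset` so that the argument helper can be inspected without the package installed.

```python
import os.path as osp


def make_dataset_kwargs(data_root, split="val"):
    """Collect keyword arguments for CLEVRCustomTransferDataset.

    The paths below follow a hypothetical directory layout; adjust them
    to wherever the scene, question, and image files actually live.
    """
    return dict(
        scenes_json=osp.join(data_root, "scenes", f"{split}_scenes.json"),
        questions_json=osp.join(data_root, "questions", f"{split}_questions.json"),
        image_root=osp.join(data_root, "images", split),
        query_list_key="questions",
        custom_fields=("my_custom_field",),  # hypothetical scene-annotation field
        output_vocab_json=None,  # None: build the output vocab from the dataset
        incl_scene=True,
        incl_raw_scene=False,
    )


def build_dataset(data_root, image_transform, split="val"):
    """Instantiate the dataset; image_transform is any callable image transform."""
    # Deferred import: only needed when the dataset is actually built.
    from concepts.benchmark.clevr.dataset import CLEVRCustomTransferDataset

    kwargs = make_dataset_kwargs(data_root, split)
    return CLEVRCustomTransferDataset(image_transform=image_transform, **kwargs)
```

Keeping the path logic in a separate function makes it easy to print or log the resolved arguments before any file is read.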

__iter__()#

__len__()[source]#

__new__(**kwargs)#

get_metainfo(index)#

make_dataloader(batch_size, shuffle, drop_last, nr_workers)[source]#

Make a dataloader for this dataset view.

Parameters:

batch_size (int) – the batch size.
shuffle (bool) – whether to shuffle the dataset.
drop_last (bool) – whether to drop the final incomplete batch when the number of samples is not divisible by the batch size.
nr_workers (int) – the number of workers for the dataloader.

Returns:

a JacDataLoader instance.

Return type:

JacDataLoader
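The following sketch shows one way to call `make_dataloader` and how `drop_last` affects the number of batches. The batch-count arithmetic assumes the loader follows the usual PyTorch DataLoader convention, which this page does not confirm; `expected_num_batches` and `make_train_loader` are hypothetical helper names, and the default values are illustrative only.

```python
import math


def expected_num_batches(num_samples, batch_size, drop_last):
    """Number of batches per epoch under the usual DataLoader convention."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if drop_last:
        # The final incomplete batch is discarded.
        return num_samples // batch_size
    # The final incomplete batch is kept.
    return math.ceil(num_samples / batch_size)


def make_train_loader(dataset, batch_size=32, nr_workers=4):
    """Build a shuffled loader that drops the final incomplete batch.

    Arguments are passed positionally in the documented order:
    batch_size, shuffle, drop_last, nr_workers.
    """
    return dataset.make_dataloader(batch_size, True, True, nr_workers)


# With 10 samples and a batch size of 4, one partial batch of 2 remains.
batches_dropping = expected_num_batches(10, 4, drop_last=True)
batches_keeping = expected_num_batches(10, 4, drop_last=False)
```

Dropping the last batch keeps every batch the same size, which is often convenient during training; for evaluation it is usually safer to keep it so that no sample is skipped.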