Iterators

Base Iterator

class udao.data.iterators.base_iterator.BaseDatasetIterator(keys: Sequence[str], *args: Type[BaseContainer])

Bases: Dataset

Base class for all dataset iterators. Inherits from torch.utils.data.Dataset.

static collate(items: List[Any]) → Any

Collates the items into a batch. Used in the dataloader.

get_dataloader(batch_size: int, shuffle: bool = False, num_workers: int = 0, **kwargs: Any) → DataLoader

Returns a torch DataLoader for the iterator that can be used for training. Batches are assembled with the collate static method.
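To illustrate how a static collate method and a batching loader fit together, here is a minimal sketch in plain Python. ToyIterator and iter_batches are hypothetical stand-ins written for this example; they are not part of udao or torch. With torch installed, the equivalent wiring would normally go through DataLoader's collate_fn argument.

```python
from typing import Any, Dict, Iterator, List, Sequence


class ToyIterator:
    """Hypothetical stand-in for a dataset iterator: maps keys to items."""

    def __init__(self, keys: Sequence[str], table: Dict[str, List[float]]) -> None:
        self.keys = list(keys)
        self.table = table

    def __len__(self) -> int:
        return len(self.keys)

    def __getitem__(self, idx: int) -> List[float]:
        # Look up the item through its key, as a key-indexed container would.
        return self.table[self.keys[idx]]

    @staticmethod
    def collate(items: List[Any]) -> List[List[float]]:
        # Here a batch is simply the list of rows; a torch version would stack tensors.
        return [list(item) for item in items]


def iter_batches(dataset: ToyIterator, batch_size: int) -> Iterator[Any]:
    """Yield collated batches in order (no shuffling), mimicking a dataloader."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        items = [dataset[i] for i in range(start, stop)]
        yield dataset.collate(items)


dataset = ToyIterator(
    ["a", "b", "c"],
    {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]},
)
batches = list(iter_batches(dataset, batch_size=2))
```

With three keys and a batch size of 2, the last batch holds the single remaining item.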

classmethod get_parameter_names() → List[str]

Returns the names of the container parameters of the iterator. Useful for creating dynamic parameters for related parts of the pipeline (feature extractors, preprocessors).
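One way such a lookup could work is by inspecting the constructor signature and keeping the parameters annotated with a container type. The sketch below assumes that approach; it is not confirmed to be how udao implements it, and the BaseContainer, TabularContainer, and ToyIterator classes are hypothetical stand-ins defined only for this example.

```python
import inspect
from typing import List, Sequence


class BaseContainer:
    """Hypothetical stand-in for a feature container base class."""


class TabularContainer(BaseContainer):
    """Hypothetical stand-in for a tabular container."""


class ToyIterator:
    def __init__(
        self,
        keys: Sequence[str],
        features: TabularContainer,
        objectives: TabularContainer,
    ) -> None:
        self.keys = keys
        self.features = features
        self.objectives = objectives

    @classmethod
    def get_parameter_names(cls) -> List[str]:
        # Keep only constructor parameters annotated with a container class.
        params = inspect.signature(cls.__init__).parameters
        return [
            name
            for name, param in params.items()
            if isinstance(param.annotation, type)
            and issubclass(param.annotation, BaseContainer)
        ]


names = ToyIterator.get_parameter_names()
```

The keys parameter is skipped because its annotation, Sequence[str], is not a container class.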

Query Plan Iterator

class udao.data.iterators.query_plan_iterator.QueryPlanIterator(keys: Sequence[str], tabular_features: TabularContainer, objectives: TabularContainer, query_structure: QueryStructureContainer, **kwargs: TabularContainer)

Bases: BaseDatasetIterator

Iterator that returns a dgl.DGLGraph for each key, with associated node features. The features are stored in the graph.ndata dictionary. They are expected to be float tensors whose first dimension equals the number of nodes in the graph.

Parameters:
  • keys (Sequence[str]) – Keys of the dataset, used for accessing all features

  • tabular_features (TabularContainer) – Container for the tabular features associated with the plan

  • objectives (TabularContainer) – Container for the objectives associated with the plan

  • query_structure (QueryStructureContainer) – Wrapper around the graph structure and the features for each query plan

  • kwargs (BaseContainer) – Variable number of other features to add to the graph, e.g. embeddings

class FeatureItem(graph: DGLGraph, features: Tensor, objectives: Tensor)

Bases: object

Named tuple for the features of a query plan.

static collate(items: List[FeatureItem]) → Tuple[DGLGraph, Tensor, Tensor]

Collate a list of FeatureItem into a single graph.
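The idea of collating several per-plan graphs into a single graph can be sketched without DGL. The example below is an illustration in plain Python, not udao's implementation: ToyGraph is a hypothetical stand-in for dgl.DGLGraph, the FeatureItem defined here only mirrors the documented field names, and lists stand in for tensors. In DGL itself, graphs are usually merged with dgl.batch.

```python
from typing import List, NamedTuple, Tuple


class ToyGraph(NamedTuple):
    """Hypothetical stand-in for a graph: node count plus directed edges."""

    num_nodes: int
    edges: List[Tuple[int, int]]


class FeatureItem(NamedTuple):
    """Mirrors the documented fields; lists stand in for tensors."""

    graph: ToyGraph
    features: List[float]
    objectives: List[float]


def collate(
    items: List[FeatureItem],
) -> Tuple[ToyGraph, List[List[float]], List[List[float]]]:
    """Merge graphs into one disjoint graph and stack features and objectives."""
    offset = 0
    edges: List[Tuple[int, int]] = []
    for item in items:
        # Shift node ids so that each plan occupies its own id range.
        edges.extend((src + offset, dst + offset) for src, dst in item.graph.edges)
        offset += item.graph.num_nodes
    batched = ToyGraph(num_nodes=offset, edges=edges)
    features = [item.features for item in items]
    objectives = [item.objectives for item in items]
    return batched, features, objectives


items = [
    FeatureItem(ToyGraph(2, [(0, 1)]), [0.5], [10.0]),
    FeatureItem(ToyGraph(3, [(0, 1), (1, 2)]), [0.7], [20.0]),
]
graph, features, objectives = collate(items)
```

The batched graph keeps the plans disconnected from each other, so node-level features from different plans never mix during message passing.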

Tabular Iterator

class udao.data.iterators.tabular_iterator.TabularIterator(keys: Sequence[str], tabular_feature: TabularContainer)

Bases: BaseDatasetIterator

Iterator on tabular data.

Parameters:
  • keys (Sequence[str]) – Keys of the dataset, used for accessing all features

  • tabular_feature (TabularContainer) – Container for the tabular data