MultiModalPredictor.evaluate

MultiModalPredictor.evaluate(data: DataFrame | dict | list | str, query_data: list | None = None, response_data: list | None = None, id_mappings: Dict[str, Dict] | Dict[str, Series] | None = None, metrics: str | List[str] | None = None, chunk_size: int | None = 1024, similarity_type: str | None = 'cosine', cutoffs: List[int] | None = [1, 5, 10], label: str | None = None, return_pred: bool | None = False, realtime: bool | None = False, eval_tool: str | None = None, predictions: List[ndarray] | None = None, labels: ndarray | None = None)

Evaluate the model on a given dataset.

Parameters:
  • data – A pd.DataFrame containing the same columns as the training data, or a str path to the annotation file for object detection.

  • query_data – Query data used for ranking (see the ranking sketch after this parameter list).

  • response_data – Response data used for ranking.

  • id_mappings – Id-to-content mappings. The contents can be text, image, etc. This is used when data/query_data/response_data contain the query/response identifiers instead of their contents.

  • metrics – A list of metric names to report. If None, we only return the score for the stored _eval_metric_name.

  • chunk_size – Scan the response data in chunks of chunk_size rows at a time. Increasing the value speeds up scanning but requires more memory.

  • similarity_type – Which similarity function to use for scoring, “cosine” or “dot_prod” (default: “cosine”).

  • cutoffs – A list of cutoff values to evaluate ranking.

  • label – The label column name in data. Some tasks, e.g., image<->text matching, have no label column in their training data, but the label column may still be required for evaluation.

  • return_pred – Whether to return the prediction result of each row.

  • realtime – Whether to do realtime inference, which is efficient for small data (default: False). If None, it is inferred based on the data modalities and the number of samples.

  • eval_tool – The evaluation tool to use for object detection; either “pycocotools” or “torchmetrics”.
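
The ranking-related parameters (query_data, response_data, id_mappings, label, similarity_type, cutoffs, chunk_size) work together. Below is a minimal sketch of one plausible call shape; the saved-model path, column names, identifiers, and data are all illustrative assumptions rather than excerpts from the library's tutorials, so adapt them to the data format your problem type expects.

    import pandas as pd
    from autogluon.multimodal import MultiModalPredictor

    # Load a previously trained matching predictor (path is illustrative).
    predictor = MultiModalPredictor.load("my_matcher/")

    # Queries and responses are given as identifiers; id_mappings resolves
    # each identifier to its actual content (keys are column names in data).
    query_data = ["q1", "q2"]
    response_data = ["r1", "r2", "r3"]
    id_mappings = {
        "query_col": pd.Series(
            {"q1": "how do I cook rice?", "q2": "what is the fastest land animal?"}
        ),
        "response_col": pd.Series(
            {
                "r1": "Rinse the rice, then simmer it covered for 15 minutes.",
                "r2": "The cheetah can reach speeds of about 100 km/h.",
                "r3": "Bread rises because yeast produces carbon dioxide.",
            }
        ),
    }

    # Labeled query-response pairs; the relevance column marks true matches.
    labeled_pairs = pd.DataFrame(
        {"query_col": ["q1", "q2"], "response_col": ["r1", "r2"], "relevance": [1, 1]}
    )

    scores = predictor.evaluate(
        data=labeled_pairs,
        query_data=query_data,
        response_data=response_data,
        id_mappings=id_mappings,
        label="relevance",
        similarity_type="cosine",
        cutoffs=[1, 5, 10],
        chunk_size=1024,
    )
    print(scores)  # ranking scores at each cutoff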

Returns:

  • A dictionary with the metric names and their corresponding scores.

  • If return_pred is True, a pd.DataFrame of per-row prediction results is additionally returned.
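
As a concrete illustration of the return values, here is a minimal usage sketch for an ordinary (e.g., classification) predictor. The saved-model path, test file, and metric names are illustrative assumptions; the tuple unpacking follows the Returns description above.

    import pandas as pd
    from autogluon.multimodal import MultiModalPredictor

    # Load a previously trained predictor (path is illustrative).
    predictor = MultiModalPredictor.load("my_predictor/")

    # Test data must contain the same columns as the training data.
    test_data = pd.read_csv("test.csv")

    # Request extra metrics plus the per-row predictions alongside the scores.
    scores, predictions = predictor.evaluate(
        test_data,
        metrics=["accuracy", "f1"],
        return_pred=True,
    )
    print(scores)              # e.g. {"accuracy": 0.91, "f1": 0.89}
    print(predictions.head())  # per-row prediction results as a pd.DataFrame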