autogluon.timeseries.TimeSeriesDataFrame
- class autogluon.timeseries.TimeSeriesDataFrame(data: Any, static_features: Optional[DataFrame] = None, *args, **kwargs)
TimeSeriesDataFrames represent a collection of time series, where each row identifies the values of an (item_id, timestamp) pair.
For example, a time series data frame could represent the daily sales of a collection of products, where each item_id identifies a product and timestamps correspond to the days.
- Parameters
data (Any) –
Time-series data to construct a TimeSeriesDataFrame. The class currently supports four input formats.
1. Time-series data in a pandas DataFrame format without multi-index. For example:
       item_id  timestamp  target
    0        0 2019-01-01       0
    1        0 2019-01-02       1
    2        0 2019-01-03       2
    3        1 2019-01-01       3
    4        1 2019-01-02       4
    5        1 2019-01-03       5
    6        2 2019-01-01       6
    7        2 2019-01-02       7
    8        2 2019-01-03       8
2. Time-series data in pandas DataFrame format with multi-index on item_id and timestamp. For example:
                        target
    item_id timestamp
    0       2019-01-01       0
            2019-01-02       1
            2019-01-03       2
    1       2019-01-01       3
            2019-01-02       4
            2019-01-03       5
    2       2019-01-01       6
            2019-01-02       7
            2019-01-03       8
3. Path to a data file in CSV or Parquet format. The file must contain columns item_id and timestamp, as well as columns with time series values. This is similar to Option 1 above (pandas DataFrame format without multi-index). Both remote (e.g., S3) and local paths are accepted.
4. Time-series data in Iterable format. For example:
    iterable_dataset = [
        {"target": [0, 1, 2], "start": pd.Timestamp("01-01-2019", freq='D')},
        {"target": [3, 4, 5], "start": pd.Timestamp("01-01-2019", freq='D')},
        {"target": [6, 7, 8], "start": pd.Timestamp("01-01-2019", freq='D')}
    ]
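To make the relationship between the first two input formats concrete, here is a minimal sketch in plain pandas. It builds the long-format frame from the first example and converts it to the multi-indexed layout of the second. It deliberately stops short of calling the TimeSeriesDataFrame constructor so that it runs without AutoGluon installed; the variable names long_df and multi_df are illustrative only.

```python
import pandas as pd

# Long format (option 1): one row per (item_id, timestamp) pair.
long_df = pd.DataFrame(
    {
        "item_id": [0, 0, 0, 1, 1, 1, 2, 2, 2],
        "timestamp": pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-03"] * 3),
        "target": list(range(9)),
    }
)

# Multi-index format (option 2): item_id and timestamp become index levels.
multi_df = long_df.set_index(["item_id", "timestamp"])

print(list(multi_df.index.names))  # ['item_id', 'timestamp']
print(multi_df.loc[(1, pd.Timestamp("2019-01-02")), "target"])  # 4
```

Either layout carries the same information; the multi-indexed form simply makes the (item_id, timestamp) pair the row key.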
static_features (Optional[pd.DataFrame]) –
An optional data frame describing the metadata attributes of individual items in the item index. These may be categorical or real-valued attributes for each item. For example, if the item index refers to time series data of individual households, static features may refer to time-independent demographic features. When provided during fit, the TimeSeriesPredictor expects the same metadata to be available during prediction time. When provided, the index of static_features must match the item index of the TimeSeriesDataFrame. TimeSeriesDataFrame will ensure consistency of static features during serialization/deserialization, copy and slice operations, although these features should be considered experimental.
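The index-matching requirement for static features can be checked before constructing the object. The sketch below uses plain pandas and made-up data (the item IDs and the region and household_size columns are hypothetical) to show a static feature frame with one row per item, indexed by item_id, and a simple check that its index covers exactly the items in the time series data.

```python
import pandas as pd

# Time series values indexed by (item_id, timestamp).
ts_df = pd.DataFrame(
    {
        "item_id": ["A", "A", "B", "B"],
        "timestamp": pd.to_datetime(["2019-01-01", "2019-01-02"] * 2),
        "target": [1.0, 2.0, 3.0, 4.0],
    }
).set_index(["item_id", "timestamp"])

# Hypothetical static features: one row per item, indexed by item_id.
static_features = pd.DataFrame(
    {"region": ["north", "south"], "household_size": [3, 2]},
    index=pd.Index(["A", "B"], name="item_id"),
)

# The static feature index should contain exactly the items in the data.
item_index = ts_df.index.unique(level="item_id")
index_matches = set(item_index) == set(static_features.index)
print(index_matches)  # True
```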
- freq
A pandas and gluon-ts compatible string describing the frequency of the time series. For example, "D" is daily data. See also https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases
- Type
str
- num_items
Number of items (time series) in the data set.
- Type
int
- item_ids
List of unique time series IDs contained in the data set.
- Type
pd.Index
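For intuition, the three attributes above have straightforward analogues on a multi-indexed pandas DataFrame. The following is a sketch in plain pandas, not the library's implementation: it derives the unique item IDs, their count, and a frequency string inferred from one item's timestamps with pd.infer_freq.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "item_id": [0, 0, 0, 1, 1, 1],
        "timestamp": pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-03"] * 2),
        "target": [0, 1, 2, 3, 4, 5],
    }
).set_index(["item_id", "timestamp"])

# Analogue of item_ids: a pd.Index of unique time series IDs.
item_ids = df.index.unique(level="item_id")

# Analogue of num_items: the number of distinct time series.
num_items = len(item_ids)

# Analogue of freq: infer an offset alias from one item's timestamps
# (pd.infer_freq needs at least three timestamps).
first_item_times = df.loc[item_ids[0]].index
freq = pd.infer_freq(first_item_times)

print(list(item_ids), num_items, freq)
```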
Methods
Make a copy of this object's indices and data.
Drop rows containing NaNs.
Fill missing values represented by NaN.
Construct a TimeSeriesDataFrame from a pandas DataFrame.
Construct a TimeSeriesDataFrame from an Iterable of dictionaries, each of which represents a single time series.
Construct a TimeSeriesDataFrame from a CSV or Parquet file.
Convenience method to read pickled time series data frames.
Returns a new TimeSeriesDataFrame object with the same underlying data and static features as the current data frame, except the time index is replaced by a new "dummy" time series index with the given frequency.
Length of each time series in the dataframe.
Select a subsequence from each time series between start (inclusive) and end (exclusive) timestamps.
Select a subsequence from each time series between start (inclusive) and end (exclusive) indices.
Split dataframe into two different TimeSeriesDataFrames before and after a certain cutoff_time.
Fill the gaps in an irregularly-sampled time series with NaNs.
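Some of the operations summarized above are easier to picture with a small example. The sketch below reproduces three of them in plain pandas on a multi-indexed frame: per-item lengths, a per-item positional slice (start inclusive, end exclusive), and filling gaps with NaNs on a regular daily grid. The helper names slice_by_position and fill_gaps are hypothetical and are not the library's methods; the code only illustrates the described behaviour under the assumption of non-negative slice bounds.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "item_id": ["A", "A", "A", "B", "B"],
        "timestamp": pd.to_datetime(
            ["2019-01-01", "2019-01-02", "2019-01-04", "2019-01-01", "2019-01-03"]
        ),
        "target": [1.0, 2.0, 4.0, 10.0, 30.0],
    }
).set_index(["item_id", "timestamp"])

# Length of each time series.
lengths = df.groupby(level="item_id").size()


def slice_by_position(frame, start, end):
    """Keep rows whose within-item position is in [start, end)."""
    position = frame.groupby(level="item_id").cumcount()
    return frame[(position >= start) & (position < end)]


def fill_gaps(frame, freq="D"):
    """Reindex each item onto a regular grid; missing steps become NaN."""
    pieces = {}
    for item_id, group in frame.groupby(level="item_id"):
        series = group.droplevel("item_id")
        grid = pd.date_range(
            series.index.min(), series.index.max(), freq=freq, name="timestamp"
        )
        pieces[item_id] = series.reindex(grid)
    return pd.concat(pieces, names=["item_id"])


first_two = slice_by_position(df, 0, 2)
regular = fill_gaps(df)

print(lengths.to_dict())
print(len(first_two), len(regular), int(regular["target"].isna().sum()))
```

Item A is missing 2019-01-03 and item B is missing 2019-01-02, so the regularized frame gains one NaN row per item.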
Attributes