autogluon.timeseries.TimeSeriesDataFrame
- class autogluon.timeseries.TimeSeriesDataFrame(data: Any, static_features: Optional[DataFrame] = None, *args, **kwargs)
A TimeSeriesDataFrame represents a collection of time series, where each row identifies the values of an (item_id, timestamp) pair.
For example, a time series data frame could represent the daily sales of a collection of products, where each item_id identifies a product and timestamps correspond to the days.
- Parameters
data (Any): Time-series data to construct a TimeSeriesDataFrame. The class currently supports four input formats.
1. Time-series data in a pandas DataFrame format without multi-index. For example:
      item_id  timestamp  target
   0        0 2019-01-01       0
   1        0 2019-01-02       1
   2        0 2019-01-03       2
   3        1 2019-01-01       3
   4        1 2019-01-02       4
   5        1 2019-01-03       5
   6        2 2019-01-01       6
   7        2 2019-01-02       7
   8        2 2019-01-03       8
2. Time-series data in pandas DataFrame format with multi-index on item_id and timestamp. For example:
                       target
   item_id timestamp
   0       2019-01-01       0
           2019-01-02       1
           2019-01-03       2
   1       2019-01-01       3
           2019-01-02       4
           2019-01-03       5
   2       2019-01-01       6
           2019-01-02       7
           2019-01-03       8
3. Path to a data file in CSV or Parquet format. The file must contain columns item_id and timestamp, as well as columns with time series values. This is similar to Option 1 above (pandas DataFrame format without multi-index). Both remote (e.g., S3) and local paths are accepted.
4. Time-series data in Iterable format. For example:
   iterable_dataset = [
       {"target": [0, 1, 2], "start": pd.Timestamp("01-01-2019", freq='D')},
       {"target": [3, 4, 5], "start": pd.Timestamp("01-01-2019", freq='D')},
       {"target": [6, 7, 8], "start": pd.Timestamp("01-01-2019", freq='D')}
   ]
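The DataFrame and Iterable formats hold the same information in different shapes. The following pandas-only sketch builds the example data in each shape. The helper `iterable_to_long_frame` and its `freq` argument are hypothetical names introduced for illustration and are not part of AutoGluon; the sketch also passes the frequency separately instead of attaching it to each start value.

```python
import pandas as pd

# Long format: plain columns item_id, timestamp, target (no multi-index).
long_df = pd.DataFrame(
    {
        "item_id": [0, 0, 0, 1, 1, 1, 2, 2, 2],
        "timestamp": pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-03"] * 3),
        "target": list(range(9)),
    }
)

# Multi-index format: the same rows indexed by (item_id, timestamp).
multi_df = long_df.set_index(["item_id", "timestamp"])


def iterable_to_long_frame(dataset, freq="D"):
    """Hypothetical helper: expand dicts with "target" and "start" into long rows.

    Illustration only; this is not how AutoGluon parses the Iterable format.
    """
    frames = []
    for item_id, entry in enumerate(dataset):
        timestamps = pd.date_range(entry["start"], periods=len(entry["target"]), freq=freq)
        frames.append(
            pd.DataFrame({"item_id": item_id, "timestamp": timestamps, "target": entry["target"]})
        )
    return pd.concat(frames, ignore_index=True)


iterable_dataset = [
    {"target": [0, 1, 2], "start": pd.Timestamp("2019-01-01")},
    {"target": [3, 4, 5], "start": pd.Timestamp("2019-01-01")},
    {"target": [6, 7, 8], "start": pd.Timestamp("2019-01-01")},
]
from_iterable = iterable_to_long_frame(iterable_dataset)
```

Here `long_df` and `multi_df` have the shapes described for the two DataFrame formats, and `from_iterable` contains the same rows as `long_df`.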
static_features (Optional[pd.DataFrame]): An optional data frame describing the metadata attributes of individual items in the item index. These may be categorical or real-valued attributes for each item. For example, if the item index refers to time series data of individual households, static features may refer to time-independent demographic features. When provided during fit, the TimeSeriesPredictor expects the same metadata to be available at prediction time. When provided, the index of static_features must match the item index of the TimeSeriesDataFrame. TimeSeriesDataFrame will ensure consistency of static features during serialization/deserialization, copy, and slice operations, although these features should be considered experimental.
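The index-matching requirement can be checked before constructing the object. A minimal pandas-only sketch follows, in which the column names `region` and `household_size` and the helper `static_features_match` are hypothetical and not part of AutoGluon:

```python
import pandas as pd

# Time series data indexed by (item_id, timestamp).
data = pd.DataFrame(
    {
        "item_id": [0, 0, 1, 1, 2, 2],
        "timestamp": pd.to_datetime(["2019-01-01", "2019-01-02"] * 3),
        "target": [0, 1, 3, 4, 6, 7],
    }
).set_index(["item_id", "timestamp"])

# One row of time-independent metadata per item, indexed by item_id.
static_features = pd.DataFrame(
    {"region": ["north", "south", "north"], "household_size": [2.0, 4.0, 3.0]},
    index=pd.Index([0, 1, 2], name="item_id"),
)


def static_features_match(data, static_features):
    """Hypothetical check: same set of item ids on both sides, no duplicates."""
    item_ids = data.index.get_level_values("item_id").unique()
    return static_features.index.is_unique and set(static_features.index) == set(item_ids)


ok = static_features_match(data, static_features)
# Dropping the metadata row for item 2 breaks the match.
bad = static_features_match(data, static_features.drop(index=2))
```

A check of this kind catches a missing or extra item before the frame and its metadata are passed as `data` and `static_features`.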
- freq
A pandas- and gluon-ts-compatible string describing the frequency of the time series. For example, "D" denotes daily data. See also https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases
- Type
str
- num_items
Number of items (time series) in the data set.
- Type
int
- item_ids
List of unique time series IDs contained in the data set.
- Type
pd.Index
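To make these attributes concrete, the sketch below computes comparable quantities from a plain multi-indexed pandas DataFrame. It only illustrates what the values mean; how TimeSeriesDataFrame derives them internally is not shown here, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "item_id": [0, 0, 0, 1, 1, 1],
        "timestamp": pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-03"] * 2),
        "target": [0, 1, 2, 3, 4, 5],
    }
).set_index(["item_id", "timestamp"])

# Unique series identifiers, analogous to item_ids (a pd.Index).
item_ids = df.index.get_level_values("item_id").unique()

# Number of distinct series, analogous to num_items.
num_items = len(item_ids)

# Frequency string inferred from one series' timestamps, analogous to freq.
first_series = df.loc[item_ids[0]]
freq = pd.infer_freq(first_series.index)
```

`pd.infer_freq` needs at least three timestamps, which is why each example series has three rows.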
Methods
Make a copy of this object's indices and data.
Drop rows containing NaNs.
Fill missing values represented by NaN.
Construct a TimeSeriesDataFrame from a pandas DataFrame.
Construct a TimeSeriesDataFrame from an Iterable of dictionaries, each of which represents a single time series.
Construct a TimeSeriesDataFrame from a CSV or Parquet file.
Convenience method to read pickled time series data frames.
Returns a new TimeSeriesDataFrame object with the same underlying data and static features as the current data frame, except the time index is replaced by a new "dummy" time series index with the given frequency.
Length of each time series in the dataframe.
Select a subsequence from each time series between start (inclusive) and end (exclusive) timestamps.
Select a subsequence from each time series between start (inclusive) and end (exclusive) indices.
Split the data frame into two TimeSeriesDataFrames, before and after a certain cutoff_time.
Fill the gaps in an irregularly sampled time series with NaNs.
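The inclusive-start, exclusive-end convention described for the two slicing methods can be illustrated with plain pandas. This sketch mimics those semantics on a multi-indexed DataFrame for non-negative positions only; the helper names `slice_each_by_time` and `slice_each_by_position` are hypothetical and are not the library's methods.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "item_id": [0, 0, 0, 1, 1, 1],
        "timestamp": pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-03"] * 2),
        "target": [0, 1, 2, 3, 4, 5],
    }
).set_index(["item_id", "timestamp"])


def slice_each_by_time(df, start, end):
    """Keep rows with start <= timestamp < end in every series."""
    ts = df.index.get_level_values("timestamp")
    return df[(ts >= start) & (ts < end)]


def slice_each_by_position(df, start, end):
    """Keep rows whose position within their own series lies in [start, end)."""
    position = df.groupby(level="item_id").cumcount()
    return df[(position >= start) & (position < end)]


# Only 2019-01-02 survives: the end timestamp is excluded.
by_time = slice_each_by_time(df, pd.Timestamp("2019-01-02"), pd.Timestamp("2019-01-03"))

# The first two observations of each series: positions 0 and 1.
by_position = slice_each_by_position(df, 0, 2)
```

Both helpers apply the same window to every series independently, so each item contributes its own subsequence to the result.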
Attributes