Task I/O Targets

How is task data saved and loaded?

Task data is saved to a file, a database table, or memory (cache). You control how a task's output is saved by choosing the parent class for the task. In the example below, data is saved as parquet and loaded as a pandas dataframe because the parent class is TaskPqPandas. The type of Python object you want to save determines which formats are available.

class YourTask(d6tflow.tasks.TaskPqPandas):

Task Output Location

By default, file-based task output is saved in data/. You can change the output directory with d6tflow.set_dir():

d6tflow.set_dir('../data')

Core task targets (Pandas)

The kind of object you want to save determines which Task class you need to use.

pandas
- d6tflow.tasks.TaskPqPandas: save to parquet, load as pandas
- d6tflow.tasks.TaskCachePandas: save to memory, load as pandas
- d6tflow.tasks.TaskCSVPandas: save to CSV, load as pandas
- d6tflow.tasks.TaskExcelPandas: save to Excel, load as pandas
- d6tflow.tasks.TaskSQLPandas: save to SQL, load as pandas (premium, see below)
dicts
- d6tflow.tasks.TaskJson: save to JSON, load as Python dict
- d6tflow.tasks.TaskPickle: save to pickle, load as the original Python object
- NB: don’t save a dict of pandas dataframes as pickle; instead, save them as multiple outputs, see “save more than one output” in Tasks
any Python object (e.g. trained models)
- d6tflow.tasks.TaskPickle: save to pickle, load as the original Python object
- d6tflow.tasks.TaskCache: save to memory, load as Python object
dask, SQL, pyspark: premium features, see below
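The practical difference between the JSON and pickle options in the list above can be seen with the standard library alone. The sketch below does not use d6tflow; it assumes that the corresponding task classes rely on the same serialization formats, and the record is made up for illustration.

```python
import json
import pickle

# Illustrative object with an integer key and a tuple value
record = {1: ("a", "b"), "scores": [0.5, 0.75]}

# JSON round trip: integer keys become strings and tuples become lists
from_json = json.loads(json.dumps(record))

# pickle round trip: the object comes back with its original types
from_pickle = pickle.loads(pickle.dumps(record))

print(from_json)              # {'1': ['a', 'b'], 'scores': [0.5, 0.75]}
print(from_pickle == record)  # True
```

JSON is a reasonable fit for plain dicts of strings, numbers and lists; anything with richer types is safer as pickle.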

Premium Targets (Dask, SQL, Pyspark)

Database Targets

d6tflow premium has database targets. Request access at https://pipe.databolt.tech/gui/request-premium/

Dask Targets

d6tflow premium has dask targets. Request access at https://pipe.databolt.tech/gui/request-premium/

Pyspark Targets

d6tflow premium has pyspark targets. Request access at https://pipe.databolt.tech/gui/request-premium/

Community Targets

Keras Model Targets

To save Keras models as task output, use TaskH5Keras:

from d6tflow.tasks.h5 import TaskH5Keras

Writing Your Own Targets

Writing your own target is often relatively simple, since you mostly need to implement the load() and save() functions. For more advanced cases you also have to implement the exists() and invalidate() functions. Check the source code for details or raise an issue.
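To give a feel for the four methods mentioned above, here is a standalone sketch of a JSON file target. JsonFileTarget is a hypothetical class written for illustration only: it does not subclass any d6tflow or luigi base class, and the existence check is spelled exists() on the assumption that d6tflow follows luigi's Target interface. Consult the d6tflow source for the actual base classes and signatures.

```python
import json
import tempfile
from pathlib import Path


class JsonFileTarget:
    """Hypothetical standalone target; not part of d6tflow."""

    def __init__(self, path):
        self.path = Path(path)

    def exists(self):
        # The output counts as complete once the file is on disk
        return self.path.exists()

    def invalidate(self):
        # Remove the output so the owning task would run again
        if self.exists():
            self.path.unlink()
        return not self.exists()

    def save(self, obj):
        # Create parent folders on demand, then write the object as JSON
        self.path.parent.mkdir(parents=True, exist_ok=True)
        with self.path.open("w") as fh:
            json.dump(obj, fh)
        return self.path

    def load(self):
        if not self.exists():
            raise RuntimeError(f"Target {self.path} has not been saved yet")
        with self.path.open() as fh:
            return json.load(fh)


# Usage: save, load, then invalidate inside a temporary directory
target = JsonFileTarget(Path(tempfile.mkdtemp()) / "out" / "params.json")
target.save({"alpha": 0.1})
loaded = target.load()
target.invalidate()
```

A real target would plug these methods into d6tflow's own base classes so that tasks can call save() and load() on it transparently.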