airflow_dbt_python.operators

Airflow operators for all dbt commands.

class airflow_dbt_python.operators.dbt.DbtBaseOperator(project_dir=None, profiles_dir=None, profile=None, target=None, state=None, compiled_target=None, cache_selected_only=None, fail_fast=None, single_threaded=None, threads=None, use_experimental_parser=None, vars=None, warn_error=None, debug=None, log_path=None, log_level=None, log_level_file=None, log_format=LogFormat.DEFAULT, log_format_file=LogFormat.DEBUG, log_cache_events=False, quiet=None, no_quiet=None, print=None, no_print=None, record_timing_info=None, defer=None, no_defer=None, partial_parse=False, no_partial_parse=None, introspect=None, no_introspect=None, use_colors=None, no_use_colors=None, static_parser=None, no_static_parser=None, version_check=None, no_version_check=None, write_json=None, write_perf_info=None, send_anonymous_usage_stats=None, no_send_anonymous_usage_stats=None, partial_parse_file_diff=None, no_partial_parse_file_diff=None, inject_ephemeral_ctes=None, no_inject_ephemeral_ctes=None, empty=None, no_empty=None, show_resource_report=None, no_show_resource_report=None, favor_state=None, no_favor_state=None, export_saved_queries=None, no_export_saved_queries=None, dbt_conn_id=None, profiles_conn_id=None, project_conn_id=None, do_xcom_push_artifacts=None, upload_dbt_project=False, delete_before_upload=False, replace_on_upload=False, env_vars=None, **kwargs)[source]

The basic Airflow dbt operator.

Defines how to build an argument list and execute a dbt command. Does not set a command itself, subclasses should set it.

Parameters:
  • project_dir (Optional[Union[str, Path]])

  • profiles_dir (Optional[Union[str, Path]])

  • profile (Optional[str])

  • target (Optional[str])

  • state (Optional[str])

  • compiled_target (Optional[Union[os.PathLike, str, bytes]])

  • cache_selected_only (Optional[bool])

  • fail_fast (Optional[bool])

  • single_threaded (Optional[bool])

  • threads (Optional[int])

  • use_experimental_parser (Optional[bool])

  • vars (Optional[dict[str, str]])

  • warn_error (Optional[bool])

  • debug (Optional[bool])

  • log_path (Optional[str])

  • log_level (Optional[str])

  • log_level_file (Optional[str])

  • log_format (LogFormat)

  • log_format_file (LogFormat)

  • log_cache_events (Optional[bool])

  • quiet (Optional[bool])

  • no_quiet (Optional[bool])

  • print (Optional[bool])

  • no_print (Optional[bool])

  • record_timing_info (Optional[str])

  • defer (Optional[bool])

  • no_defer (Optional[bool])

  • partial_parse (Optional[bool])

  • no_partial_parse (Optional[bool])

  • introspect (Optional[bool])

  • no_introspect (Optional[bool])

  • use_colors (Optional[bool])

  • no_use_colors (Optional[bool])

  • static_parser (Optional[bool])

  • no_static_parser (Optional[bool])

  • version_check (Optional[bool])

  • no_version_check (Optional[bool])

  • write_json (Optional[bool])

  • write_perf_info (Optional[bool])

  • send_anonymous_usage_stats (Optional[bool])

  • no_send_anonymous_usage_stats (Optional[bool])

  • partial_parse_file_diff (Optional[bool])

  • no_partial_parse_file_diff (Optional[bool])

  • inject_ephemeral_ctes (Optional[bool])

  • no_inject_ephemeral_ctes (Optional[bool])

  • empty (Optional[bool])

  • no_empty (Optional[bool])

  • show_resource_report (Optional[bool])

  • no_show_resource_report (Optional[bool])

  • favor_state (Optional[bool])

  • no_favor_state (Optional[bool])

  • export_saved_queries (Optional[bool])

  • no_export_saved_queries (Optional[bool])

  • dbt_conn_id (Optional[str])

  • profiles_conn_id (Optional[str])

  • project_conn_id (Optional[str])

  • do_xcom_push_artifacts (Optional[list[str]])

  • upload_dbt_project (bool)

  • delete_before_upload (bool)

  • replace_on_upload (bool)

  • env_vars (Optional[Dict[str, Any]])

project_dir

Directory for dbt to look for dbt_profile.yml. Defaults to current directory.

profiles_dir

Directory for dbt to look for profiles.yml. Defaults to ~/.dbt.

profile

Which profile to load. Overrides dbt_profile.yml.

target

Which target to load for the given profile.

state

Unless overridden, use this state directory for both state comparison and deferral.

cache_selected_only

At start of run, populate relational cache only for schemas containing selected nodes, or for all schemas of interest.

fail_fast

Stop execution on first failure.

single_threaded

Execute dbt command in single threaded mode. For test only

threads

Specify number of threads to use while executing models. Overrides settings in profiles.yml.

use_experimental_parser

Enable experimental parsing features.

vars

Supply variables to the project. Should be a YAML string. Overrides variables defined in dbt_profile.yml.

warn_error

If dbt would normally warn, instead raise an exception. Examples include –select that selects nothing, deprecations, configurations with no associated models, invalid test configurations, and missing sources/refs in tests.

debug

Display debug logging during dbt execution. Useful for debugging and making bug reports.

log_path

Log path for dbt execution.

log_level

Specify the minimum severity of events that are logged to the console and the log file.

log_level_file

Specify the minimum severity of events that are logged to the log file by overriding the default value

log_format

Specify the format of logging to the console and the log file.

log_format_file

Specify the format of logging to the log file by overriding the default value

log_cache_events

Flag to enable logging of cache events.

quiet/no_quiet

Suppress all non-error logging to stdout. Does not affect {{ print() }} macro calls.

print/no_print

Output all {{ print() }} macro calls.

record_timing_info

When this option is passed, dbt will output low-level timing stats to the specified file.

defer/no_defer

If set, resolve unselected nodes by deferring to the manifest within the –state directory.

partial_parse/no_partial_parse

Allow for partial parsing by looking for and writing to a pickle file in the target directory.

introspect/no_introspect

Whether to scaffold introspective queries as part of compilation

use_colors/no_use_colors

Specify whether log output is colorized in the console and the log file.

static_parser/no_static_parser

Use the static parser.

version_check/no_version_check

If set, ensure the installed dbt version matches the require-dbt-version specified in the dbt_project.yml file (if any). Otherwise, allow them to differ.

write_json/no_write_json

Whether or not to write the manifest.json and run_results.json files to the target directory

send_anonymous_usage_stats/no_send_anonymous_usage_stats

Send anonymous usage stats to dbt Labs.

partial_parse_file_diff/no_partial_parse_file_diff

Internal flag for whether to compute a file diff during partial parsing.

inject_ephemeral_ctes/no_inject_ephemeral_ctes

Internal flag controlling injection of referenced ephemeral models’ CTEs during compile.

empty/no_empty

If specified, limit input refs and sources to zero rows.

show_resource_report/no_show_resource_report

If set, dbt will output resource report into log.

favor_state/no_favor_state

If set, defer to the argument provided to the state flag for resolving unselected nodes, even if the node(s) exist as a database object in the current environment.

export_saved_queries/no_export_saved_queries

Export saved queries within the ‘build’ command, otherwise no-op

dbt_conn_id

An Airflow connection ID to generate dbt profile from.

profiles_conn_id

An Airflow connection ID to use for pulling dbt profiles from remote (e.g. git/s3/gcs).

project_conn_id

An Airflow connection ID to use for pulling dbt project from remote (e.g. git/s3/gcs).

do_xcom_push_artifacts

A list of dbt artifacts to XCom push.

upload_dbt_project

Flag to enable unloading the project dbt after the operator execution back to project_dir.

delete_before_upload

Flag to enable cleaning up project_dir before uploading dbt project back to.

replace_on_upload

Flag to allow replacing files when uploading dbt project back to project_dir.

env_vars

Supply environment variables to the project

abstract property command: str

Return the current dbt command.

Each subclass of DbtBaseOperator should return its corresponding command.

property dbt_hook: DbtHook

Provides an existing DbtHook or creates one.

execute(context)[source]

Execute dbt command with prepared arguments.

Execution requires setting up a directory with the dbt project files and overriding the logging.

Parameters:

context – The Airflow’s task context

make_run_results_serializable(result)[source]

Makes dbt’s run result JSON-serializable.

Turn dbt’s RunResult into a dict of only JSON-serializable types Each subclas may implement this method to return a dictionary of JSON-serializable types, the default XCom backend. If implementing custom XCom backends, this method may be overriden.

Parameters:

result (Optional[RunResult])

Return type:

Optional[dict[Any, Any]]

xcom_push_dbt_results(context, dbt_results)[source]

Push any dbt results to XCom.

Parameters:
  • context – The Airflow task’s context.

  • dbt_results (DbtTaskResult) – A namedtuple the results of executing a dbt task, as returned by DbtHook.

Return type:

None

class airflow_dbt_python.operators.dbt.DbtBuildOperator(full_refresh=None, select=None, exclude=None, selector_name=None, singular=None, generic=None, show=None, indirect_selection=None, **kwargs)[source]

Executes a dbt build command.

The build command combines the run, test, seed, and snapshot commands into one. The full Documentation for the dbt build command can be found here: https://docs.getdbt.com/reference/commands/build.

Parameters:
  • full_refresh (Optional[bool])

  • select (Optional[list[str]])

  • exclude (Optional[list[str]])

  • selector_name (Optional[str])

  • singular (Optional[bool])

  • generic (Optional[bool])

  • show (Optional[bool])

  • indirect_selection (Optional[str])

property command: str

Return the build command.

class airflow_dbt_python.operators.dbt.DbtCleanOperator(upload_dbt_project=True, delete_before_upload=True, clean_project_files_only=True, **kwargs)[source]

Executes a dbt clean command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/debug.

Parameters:
  • upload_dbt_project (bool)

  • delete_before_upload (bool)

  • clean_project_files_only (bool)

property command: str

Return the clean command.

class airflow_dbt_python.operators.dbt.DbtCompileOperator(parse_only=None, full_refresh=None, models=None, select=None, exclude=None, selector_name=None, upload_dbt_project=True, **kwargs)[source]

Executes a dbt compile command.

The compile command generates SQL files. The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/compile.

Parameters:
  • parse_only (Optional[bool])

  • full_refresh (Optional[bool])

  • models (Optional[list[str]])

  • select (Optional[list[str]])

  • exclude (Optional[list[str]])

  • selector_name (Optional[str])

  • upload_dbt_project (bool)

property command: str

Return the compile command.

class airflow_dbt_python.operators.dbt.DbtDebugOperator(config_dir=None, no_version_check=None, **kwargs)[source]

Executes a dbt debug command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/debug.

Parameters:
  • config_dir (Optional[bool])

  • no_version_check (Optional[bool])

property command: str

Return the debug command.

class airflow_dbt_python.operators.dbt.DbtDepsOperator(upload_dbt_project=True, **kwargs)[source]

Executes a dbt deps command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/deps.

Parameters:

upload_dbt_project (bool)

property command: str

Return the deps command.

class airflow_dbt_python.operators.dbt.DbtDocsGenerateOperator(compile=True, upload_dbt_project=True, **kwargs)[source]

Executes a dbt docs generate command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/cmd-docs.

Parameters:

upload_dbt_project (bool)

property command: str

Return the generate command.

airflow_dbt_python.operators.dbt.DbtListOperator

alias of DbtLsOperator

class airflow_dbt_python.operators.dbt.DbtLsOperator(resource_types=None, select=None, exclude=None, selector_name=None, dbt_output=None, output_keys=None, indirect_selection=None, **kwargs)[source]

Executes a dbt list (or ls) command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/list.

Parameters:
  • resource_types (Optional[list[str]])

  • select (Optional[list[str]])

  • exclude (Optional[list[str]])

  • selector_name (Optional[str])

  • dbt_output (Optional[Output])

  • output_keys (Optional[list[str]])

  • indirect_selection (Optional[str])

property command: str

Return the list command.

class airflow_dbt_python.operators.dbt.DbtParseOperator(upload_dbt_project=True, **kwargs)[source]

Executes a dbt parse command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/parse.

Parameters:

upload_dbt_project (bool)

property command: str

Return the parse command.

class airflow_dbt_python.operators.dbt.DbtRunOperationOperator(macro, args=None, **kwargs)[source]

Executes a dbt run-operation command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/run-operation.

Parameters:
  • macro (str)

  • args (Optional[dict[str, str]])

property command: str

Return the run-operation command.

class airflow_dbt_python.operators.dbt.DbtRunOperator(full_refresh=None, models=None, select=None, selector_name=None, exclude=None, **kwargs)[source]

Executes a dbt run command.

The run command executes SQL model files against the given target. The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/run.

Parameters:
  • full_refresh (Optional[bool])

  • models (Optional[list[str]])

  • select (Optional[list[str]])

  • selector_name (Optional[list[str]])

  • exclude (Optional[list[str]])

property command: str

Return the run command.

class airflow_dbt_python.operators.dbt.DbtSeedOperator(full_refresh=None, select=None, show=None, exclude=None, selector_name=None, **kwargs)[source]

Executes a dbt seed command.

The seed command loads csv files into the given target. The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/seed.

Parameters:
  • full_refresh (Optional[bool])

  • select (Optional[list[str]])

  • show (Optional[bool])

  • exclude (Optional[list[str]])

  • selector_name (Optional[str])

property command: str

Return the seed command.

class airflow_dbt_python.operators.dbt.DbtSnapshotOperator(select=None, exclude=None, selector_name=None, **kwargs)[source]

Executes a dbt snapshot command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/snapshot.

Parameters:
  • select (Optional[list[str]])

  • exclude (Optional[list[str]])

  • selector_name (Optional[str])

property command: str

Return the snapshot command.

class airflow_dbt_python.operators.dbt.DbtSourceFreshnessOperator(select=None, dbt_output=None, exclude=None, selector_name=None, upload_dbt_project=True, **kwargs)[source]

Executes a dbt source-freshness command.

The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/source.

Parameters:
  • select (Optional[list[str]])

  • dbt_output (Optional[Union[str, Path]])

  • exclude (Optional[list[str]])

  • selector_name (Optional[str])

  • upload_dbt_project (bool)

property command: str

Return the source command.

class airflow_dbt_python.operators.dbt.DbtTestOperator(singular=None, generic=None, models=None, select=None, exclude=None, selector_name=None, indirect_selection=None, **kwargs)[source]

Executes a dbt test command.

The test command runs data and/or schema tests. The documentation for the dbt command can be found here: https://docs.getdbt.com/reference/commands/test.

Parameters:
  • singular (Optional[bool])

  • generic (Optional[bool])

  • models (Optional[list[str]])

  • select (Optional[list[str]])

  • exclude (Optional[list[str]])

  • selector_name (Optional[str])

  • indirect_selection (Optional[str])

property command: str

Return the test command.

airflow_dbt_python.operators.dbt.run_result_factory(data)[source]

Dictionary factory for dbt’s run_result.

We need to handle dt.datetime and agate.table.Table. The rest of the types should already be JSON-serializable.

Parameters:

data (list[tuple[Any, Any]])