Concepts
Dagster provides a variety of abstractions for building and orchestrating data pipelines. These concepts enable a modular, declarative approach to data engineering, making it easier to manage dependencies, monitor execution, and ensure data quality.
Asset
An asset represents a logical unit of data such as a table, dataset, or machine learning model. Assets can have dependencies on other assets, forming the data lineage for your pipelines. As the core abstraction in Dagster, assets can interact with many other Dagster concepts to facilitate certain tasks.
| Concept | Relationship | 
|---|---|
| asset check | asset may use an asset check |
| config | asset may use a config |
| io manager | asset may use an io manager |
| partition | asset may use a partition |
| resource | asset may use a resource |
| job | asset may be used in a job |
| schedule | asset may be used in a schedule |
| sensor | asset may be used in a sensor |
| definitions | asset must be set in a definitions to be deployed |
Asset Check
An asset_check is associated with an asset to ensure it meets certain expectations around data quality, freshness, or completeness. Asset checks run when the asset is executed and store metadata about the related run, including whether all the conditions of the check were met.
| Concept | Relationship | 
|---|---|
| asset | asset check may be used by an asset |
| definitions | asset check must be set in a definitions to be deployed |
Code Location
A code location is a collection of Definitions deployed in a specific environment. A code location determines the Python environment (including the version of Dagster being used as well as any other Python dependencies). A Dagster project can have multiple code locations, helping isolate dependencies.
| Concept | Relationship | 
|---|---|
| definitions | code location must contain at least one definitions |
Config
A RunConfig is a defined schema applied to a Dagster object, with its values supplied at execution time. This allows pipelines to be parameterized and reused for multiple purposes.
| Concept | Relationship | 
|---|---|
| asset | config may be used by an asset |
| job | config may be used by a job |
| schedule | config may be used by a schedule |
| sensor | config may be used by a sensor |
Definitions
Definitions is a top-level construct that contains references to all the objects of a Dagster project, such as assets, jobs and ScheduleDefinitions. Only objects included in the definitions will be deployed and visible within the Dagster UI.
| Concept | Relationship | 
|---|---|
| asset | definitions may contain one or more assets |
| asset check | definitions may contain one or more asset checks |
| io manager | definitions may contain one or more io managers |
| job | definitions may contain one or more jobs |
| resource | definitions may contain one or more resources |
| schedule | definitions may contain one or more schedules |
| sensor | definitions may contain one or more sensors |
| code location | definitions must be deployed in a code location |
Graph
A GraphDefinition connects multiple ops together to form a DAG. If you are using assets, you will not need to use graphs directly.
| Concept | Relationship | 
|---|---|
| config | graph may use a config |
| op | graph must include one or more ops |
| job | graph must be part of a job to execute |
IO Manager
An IOManager defines how data is stored and retrieved between the execution of assets and ops. This allows storage location and format to be customized at any step in a pipeline.
| Concept | Relationship | 
|---|---|
| asset | io manager may be used by an asset |
| definitions | io manager must be set in a definitions to be deployed |
Job
A job is a selection of assets or a GraphDefinition of ops. Jobs are the main form of execution in Dagster.
| Concept | Relationship | 
|---|---|
| asset | job may contain a selection of assets |
| config | job may use a config |
| graph | job may contain a graph |
| schedule | job may be used by a schedule |
| sensor | job may be used by a sensor |
| definitions | job must be set in a definitions to be deployed |
Op
An op is a computational unit of work. Ops are arranged into a GraphDefinition to dictate their order. Ops have largely been replaced by assets.
| Concept | Relationship | 
|---|---|
| type | op may use a type |
| graph | op must be contained in a graph to execute |
Partition
A PartitionsDefinition represents a logical slice of a dataset or computation mapped to certain segments (such as increments of time). Partitions enable incremental processing, making workflows more efficient by running only on the relevant subsets of data.
| Concept | Relationship | 
|---|---|
| asset | partition may be used by an asset |
Resource
A ConfigurableResource is a configurable external dependency. These can be databases, APIs, or anything outside of Dagster.
| Concept | Relationship | 
|---|---|
| asset | resource may be used by an asset |
| schedule | resource may be used by a schedule |
| sensor | resource may be used by a sensor |
| definitions | resource must be set in a definitions to be deployed |
Type
A type is a way to define and validate the data passed between ops.
| Concept | Relationship | 
|---|---|
| op | type may be used by an op |
Schedule
A ScheduleDefinition automates jobs or assets to run on a specified interval. When a job or asset is parameterized, the schedule can also be given a run configuration (RunConfig) to match.
| Concept | Relationship | 
|---|---|
| asset | schedule may include a job or selection of assets |
| config | schedule may include a config if the job or assets include a config |
| job | schedule may include a job or selection of assets |
| definitions | schedule must be set in a definitions to be deployed |
Sensor
A sensor triggers jobs or assets when an event occurs, such as a file being uploaded or a push notification. When a job or asset is parameterized, the sensor can also be given a run configuration (RunConfig) to match.
| Concept | Relationship | 
|---|---|
| asset | sensor may include a job or selection of assets |
| config | sensor may include a config if the job or assets include a config |
| job | sensor may include a job or selection of assets |
| definitions | sensor must be set in a definitions to be deployed |