Dagster
dbt external sources
When `build_dbt_assets()` runs, it automatically stages any external sources
upstream of the selected assets before executing `dbt build`:
- No Dagster code changes are needed when adding a new external source. Add the source to your dbt project's YAML and it will be staged on the next run.
- For external sources using a BigLake `connection_name`, the metadata cache is also refreshed automatically.
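As a sketch, a new external source in the dbt project's YAML might look like the following. All names, the bucket path, and the `external` options here are illustrative; the exact keys depend on your warehouse and dbt-external-tables configuration:

```yaml
# models/sources.yml -- illustrative example; table and bucket names are hypothetical
version: 2

sources:
  - name: raw_landing
    schema: raw
    tables:
      - name: orders_export
        external:
          # Location of the raw files (hypothetical GCS path)
          location: "gs://example-bucket/orders/*.csv"
          options:
            format: CSV
            skip_leading_rows: 1
```

Once the source is defined upstream of a selected asset, no Dagster code change is needed; it is staged on the next run.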
To stage sources manually (e.g. after adding a new one locally):
```shell
uv run dbt run-operation stage_external_sources \
  --vars "{'ext_full_refresh': 'true'}" \
  --args 'select: [model_name]'
```
Tableau Workbooks
Ownership: Dagster-scheduled vs externally-scheduled
Before adding a `cron_schedule` to a Tableau workbook exposure, decide whether
Dagster owns the refresh trigger:

- Dagster owns it → add `cron_schedule` to the exposure's `asset.metadata`. Dagster creates a schedule and manages the refresh.
- Tableau Server owns it → omit `cron_schedule`. The workbook still appears in the asset graph (via the `id` metadata), but its refresh is not managed by Dagster.
Do not set `cron_schedule` for workbooks already scheduled in Tableau Server —
this causes double-refreshes.
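The ownership rule boils down to the presence of the `cron_schedule` key. A minimal sketch of that check (the function name is hypothetical, not part of teamster):

```python
def dagster_owns_refresh(workbook: dict) -> bool:
    """True when the exposure's metadata carries a cron_schedule,
    i.e. Dagster creates the schedule and manages the refresh."""
    return "cron_schedule" in workbook.get("metadata", {})

# Dagster-owned: cron_schedule present in metadata
scheduled = {"name": "My Tableau Workbook",
             "metadata": {"id": "abc123", "cron_schedule": "0 2 * * *"}}
# Tableau-owned: no cron_schedule, only the id for graph membership
unscheduled = {"name": "My Other Workbook",
               "metadata": {"id": "def456"}}

print(dagster_owns_refresh(scheduled))    # True
print(dagster_owns_refresh(unscheduled))  # False
```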
Managing refresh schedules
Tableau workbook assets are configured in
`src/teamster/code_locations/kipptaf/tableau/config/assets.yaml`.
Set `assets[*].metadata.cron_schedule` to a cron string (single schedule) or a
list of strings (multiple ticks). Use crontab guru as a reference.
```yaml
- name: My Tableau Workbook
  deps:
    - [spam, eggs, jeff]
  metadata:
    id: 0n371m37-h34c-702w-h0p1-4y3dm2831v3d
    cron_schedule: 0 2 * * *

- name: My Other Tableau Workbook
  deps:
    - [foo, bar, baz]
  metadata:
    id: 3235470n-h158-4115-4nd7-h3yh4d705hu7
    cron_schedule:
      - 0 0 * * *
      - 0 12 * * *
```
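The string-or-list convention above can be handled uniformly when loading the config. A minimal sketch (the function name is hypothetical, not part of teamster):

```python
def normalize_cron_schedules(cron_schedule):
    """Return a list of cron strings from a metadata value that may be
    a single string, a list of strings, or missing (None)."""
    if cron_schedule is None:
        return []
    if isinstance(cron_schedule, str):
        return [cron_schedule]
    return list(cron_schedule)

# A single schedule yields a one-element list:
print(normalize_cron_schedules("0 2 * * *"))  # ['0 2 * * *']
# Multiple ticks pass through as a list:
print(normalize_cron_schedules(["0 0 * * *", "0 12 * * *"]))
# An omitted key means no Dagster-managed schedule:
print(normalize_cron_schedules(None))  # []
```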
To disable a schedule, comment out the `cron_schedule` key:

```yaml
- name: My Tableau Workbook
  metadata:
    id: 0n371m37-h34c-702w-h0p1-4y3dm2831v3d
    # cron_schedule: 0 2 * * *
```
Backfilling partitions
To re-run a range of historical partitions (e.g. after a bug fix or schema change):
- Navigate to the asset in Dagster Cloud.
- Click the Partitions tab.
- Select the partition range you need — use Shift+click to select a range.
- Click Materialize selected.
For a single failed partition, filter by Failed status first, then materialize only those.
Note
Backfilling many partitions at once can saturate the Kubernetes pod pool. For large backfills (50+ partitions), materialize in batches or coordinate with the data engineering team.
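If you are scripting a large backfill, the batching guidance above can be expressed as a simple chunking helper. This is a sketch only; the daily partition keys are illustrative, and each batch would be submitted as its own backfill via the UI or API:

```python
from datetime import date, timedelta

def batched(items, batch_size=50):
    """Yield successive batches of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Illustrative daily partition keys (120 days starting 2024-01-01):
start = date(2024, 1, 1)
keys = [(start + timedelta(days=n)).isoformat() for n in range(120)]

for batch in batched(keys):
    # Submit each batch as its own backfill instead of one
    # 120-partition run that could saturate the pod pool.
    print(len(batch), batch[0], "->", batch[-1])
```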
Re-materializing assets
To force a single asset to re-run regardless of its current state:
- Click the asset in the asset graph or asset catalog.
- Click Materialize (top right).
- To also re-run all downstream assets, click the dropdown arrow next to Materialize and choose Materialize with downstream.
Branch deployments
Every PR that touches Python or Dagster config automatically creates a
branch deployment — an isolated Dagster
environment running your branch's code against the teamster-test GCS bucket.
Use branch deployments to:
- Verify new assets materialize without errors before merging
- Check that definition changes (new schedules, sensors, resources) load correctly — run Reload definitions in the branch deployment UI
Branch deployments are torn down automatically when the PR is closed.
Monitoring a run in progress
- Go to Runs in the left nav.
- Click the active run.
- Use the Gantt view to see which steps are running, queued, or failed.
- Click any step to see its logs inline.
- For dbt steps, click Raw compute logs to see the full dbt output including model-level errors.