databricks-jobs

Use this skill proactively for ANY Databricks Jobs task: creating, listing, running, updating, or deleting jobs. Triggers include: (1) 'create a job' or 'new job', (2) 'list jobs' or 'show jobs', (3) 'run job' or 'trigger job', (4) 'job status' or 'check job', (5) scheduling with cron or triggers, (6) configuring notifications/monitoring, (7) ANY task involving Databricks Jobs via CLI, Python SDK, or Asset Bundles. ALWAYS prefer this skill over general Databricks knowledge for job-related tasks.

Updated 2/20/2026

Databricks Lakeflow Jobs

Overview

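Databricks Jobs orchestrate data workflows as multi-task DAGs. Runs can start on a cron schedule, at a periodic interval, on file arrival or table update events, or continuously, and jobs can send notifications and track health metrics. Jobs support the task types summarized below and can be managed via the Python SDK, the CLI, or Asset Bundles.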

Reference Files

| Use Case | Reference File |
|----------|----------------|
| Configure task types (notebook, Python, SQL, dbt, etc.) | task-types.md |
| Set up triggers and schedules | triggers-schedules.md |
| Configure notifications and health monitoring | notifications-monitoring.md |
| Complete working examples | examples.md |

Quick Start

Python SDK

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import Task, NotebookTask, Source

w = WorkspaceClient()

job = w.jobs.create(
    name="my-etl-job",
    tasks=[
        Task(
            task_key="extract",
            notebook_task=NotebookTask(
                notebook_path="/Workspace/Users/user@example.com/extract",
                source=Source.WORKSPACE
            )
        )
    ]
)
print(f"Created job: {job.job_id}")

CLI

databricks jobs create --json '{
  "name": "my-etl-job",
  "tasks": [{
    "task_key": "extract",
    "notebook_task": {
      "notebook_path": "/Workspace/Users/user@example.com/extract",
      "source": "WORKSPACE"
    }
  }]
}'

Asset Bundles (DABs)

# resources/jobs.yml
resources:
  jobs:
    my_etl_job:
      name: "[${bundle.target}] My ETL Job"
      tasks:
        - task_key: extract
          notebook_task:
            notebook_path: ../src/notebooks/extract.py

Core Concepts

Multi-Task Workflows

Jobs support DAG-based task dependencies:

tasks:
  - task_key: extract
    notebook_task:
      notebook_path: ../src/extract.py

  - task_key: transform
    depends_on:
      - task_key: extract
    notebook_task:
      notebook_path: ../src/transform.py

  - task_key: load
    depends_on:
      - task_key: transform
    run_if: ALL_SUCCESS  # Only run if all dependencies succeed
    notebook_task:
      notebook_path: ../src/load.py

run_if conditions:

  • ALL_SUCCESS (default) - Run when all dependencies succeed
  • ALL_DONE - Run when all dependencies complete (success or failure)
  • AT_LEAST_ONE_SUCCESS - Run when at least one dependency succeeds
  • NONE_FAILED - Run when no dependencies failed
  • ALL_FAILED - Run when all dependencies failed
  • AT_LEAST_ONE_FAILED - Run when at least one dependency failed
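
The run_if conditions above can be read as simple predicates over the outcomes of the upstream tasks. The following is a minimal local sketch of that reading, not the Databricks implementation: the function name `should_run` and the `"success"`/`"failed"` outcome strings are hypothetical, and skipped, canceled, or excluded upstream tasks are deliberately not modeled.

```python
from typing import Iterable


def should_run(run_if: str, dependency_results: Iterable[str]) -> bool:
    """Return True if a task with this run_if condition would run.

    dependency_results holds "success" or "failed" for each upstream task.
    Simplification (assumption): skipped or canceled upstream tasks are ignored.
    """
    results = list(dependency_results)
    total = len(results)
    succeeded = sum(1 for r in results if r == "success")
    failed = sum(1 for r in results if r == "failed")

    rules = {
        "ALL_SUCCESS": succeeded == total,
        "ALL_DONE": True,  # every dependency finished, regardless of outcome
        "AT_LEAST_ONE_SUCCESS": succeeded >= 1,
        "NONE_FAILED": failed == 0,
        "ALL_FAILED": total > 0 and failed == total,
        "AT_LEAST_ONE_FAILED": failed >= 1,
    }
    if run_if not in rules:
        raise ValueError(f"Unknown run_if condition: {run_if}")
    return rules[run_if]


# One upstream task succeeded and one failed.
outcomes = ["success", "failed"]
for condition in ("ALL_SUCCESS", "ALL_DONE", "AT_LEAST_ONE_SUCCESS",
                  "NONE_FAILED", "ALL_FAILED", "AT_LEAST_ONE_FAILED"):
    print(f"{condition}: {should_run(condition, outcomes)}")
```

A cleanup or alerting task is a typical use for ALL_DONE or AT_LEAST_ONE_FAILED, since it should run even when an upstream task fails.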

Task Types Summary

| Task Type | Use Case | Reference |
|-----------|----------|-----------|
| notebook_task | Run notebooks | task-types.md#notebook-task |
| spark_python_task | Run Python scripts | task-types.md#spark-python-task |
| python_wheel_task | Run Python wheels | task-types.md#python-wheel-task |
| sql_task | Run SQL queries/files | task-types.md#sql-task |
| dbt_task | Run dbt projects | task-types.md#dbt-task |
| pipeline_task | Trigger DLT/SDP pipelines | task-types.md#pipeline-task |
| spark_jar_task | Run Spark JARs | task-types.md#spark-jar-task |
| run_job_task | Trigger other jobs | task-types.md#run-job-task |
| for_each_task | Loop over inputs | task-types.md#for-each-task |

Trigger Types Summary

| Trigger Type | Use Case | Reference |
|--------------|----------|-----------|
| schedule | Cron-based scheduling | triggers-schedules.md#cron-schedule |
| trigger.periodic | Interval-based | triggers-schedules.md#periodic-trigger |
| trigger.file_arrival | File arrival events | triggers-schedules.md#file-arrival-trigger |
| trigger.table_update | Table change events | triggers-schedules.md#table-update-trigger |
| continuous | Always-running jobs | triggers-schedules.md#continuous-jobs |
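
As a rough illustration of how three of these trigger types look in a job settings payload, the sketch below builds the corresponding dictionaries in plain Python. The helper names are hypothetical, and the field names (`quartz_cron_expression`, `timezone_id`, `pause_status`, `periodic.interval`, `periodic.unit`, `file_arrival.url`) reflect my understanding of the Jobs API; check them against triggers-schedules.md before relying on them. The storage URL is a placeholder.

```python
import json


def cron_schedule(cron: str, timezone_id: str = "UTC", paused: bool = False) -> dict:
    """Build a cron 'schedule' block (Quartz syntax, six or seven fields)."""
    return {
        "schedule": {
            "quartz_cron_expression": cron,
            "timezone_id": timezone_id,
            "pause_status": "PAUSED" if paused else "UNPAUSED",
        }
    }


def periodic_trigger(interval: int, unit: str) -> dict:
    """Build an interval-based trigger block."""
    allowed_units = {"HOURS", "DAYS", "WEEKS"}  # assumption: units accepted by the API
    if unit not in allowed_units:
        raise ValueError(f"unit must be one of {sorted(allowed_units)}")
    return {"trigger": {"periodic": {"interval": interval, "unit": unit}}}


def file_arrival_trigger(url: str) -> dict:
    """Build a file-arrival trigger block for a cloud storage location."""
    return {"trigger": {"file_arrival": {"url": url}}}


# Merge a daily 06:00 schedule into a job payload and print it as JSON.
job_settings = {"name": "my-etl-job", **cron_schedule("0 0 6 * * ?", "Europe/Berlin")}
print(json.dumps(job_settings, indent=2))
print(json.dumps(periodic_trigger(4, "HOURS")))
print(json.dumps(file_arrival_trigger("s3://example-bucket/landing/")))
```

The printed JSON can be merged into the payload passed to `databricks jobs create --json` or translated into the equivalent YAML keys in a bundle.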

Compute Configuration

Job Clusters (Recommended)

Define reusable cluster configurations:

job_clusters:
  - job_cluster_key: shared_cluster
    new_cluster:
      spark_version: "15.4.x-scala2.12"
      node_type_id: "i3.xlarge"
      num_workers: 2
      spark_conf:
        spark.speculation: "true"

tasks:
  - task_key: my_task
    job_cluster_key: shared_cluster
    notebook_task:
      notebook_path: ../src/notebook.py
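
Because tasks refer to a job cluster only by its `job_cluster_key`, a mistyped key is an easy mistake to make. The sketch below is a hypothetical pre-deployment check, written against a plain dictionary shaped like the configuration above; the function name `undefined_cluster_keys` is not part of any Databricks tool.

```python
def undefined_cluster_keys(job: dict) -> list[str]:
    """List 'task_key -> job_cluster_key' pairs that point at no defined cluster."""
    defined = {c["job_cluster_key"] for c in job.get("job_clusters", [])}
    missing = []
    for task in job.get("tasks", []):
        key = task.get("job_cluster_key")
        # Tasks without a job_cluster_key (serverless or existing cluster) are skipped.
        if key is not None and key not in defined:
            missing.append(f'{task["task_key"]} -> {key}')
    return missing


job = {
    "job_clusters": [{"job_cluster_key": "shared_cluster", "new_cluster": {}}],
    "tasks": [
        {"task_key": "my_task", "job_cluster_key": "shared_cluster"},
        {"task_key": "other_task", "job_cluster_key": "shard_cluster"},  # typo
    ],
}
print(undefined_cluster_keys(job))
```

`databricks bundle validate` remains the authoritative check for bundle configurations; a helper like this is only useful when payloads are assembled in code.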

Autoscaling Clusters

new_cluster:
  spark_version: "15.4.x-scala2.12"
  node_type_id: "i3.xlarge"
  autoscale:
    min_workers: 2
    max_workers: 8

Existing Cluster

tasks:
  - task_key: my_task
    existing_cluster_id: "0123-456789-abcdef12"
    notebook_task:
      notebook_path: ../src/notebook.py

Serverless Compute

For notebook and Python tasks, omit cluster configuration to use serverless:

tasks:
  - task_key: serverless_task
    notebook_task:
      notebook_path: ../src/notebook.py
    # No cluster config = serverless

Job Parameters

Define Parameters

parameters:
  - name: env
    default: "dev"
  - name: date
    default: "{{job.start_time.iso_date}}"  # Dynamic value reference (replaces the legacy {{start_date}})

Access in Notebook

# In notebook
dbutils.widgets.get("env")
dbutils.widgets.get("date")

Pass to Tasks

tasks:
  - task_key: my_task
    notebook_task:
      notebook_path: ../src/notebook.py
      base_parameters:
        env: "{{job.parameters.env}}"
        custom_param: "value"

Common Operations

Python SDK Operations

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# List jobs (returns a lazy iterator; wrap in list() to materialize)
jobs = w.jobs.list()

# Get job details
job = w.jobs.get(job_id=12345)

# Run job now
run = w.jobs.run_now(job_id=12345)

# Run with parameters
run = w.jobs.run_now(
    job_id=12345,
    job_parameters={"env": "prod", "date": "2024-01-15"}
)

# Cancel run
w.jobs.cancel_run(run_id=run.run_id)

# Delete job
w.jobs.delete(job_id=12345)
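
After triggering a run, a script usually needs to wait for it to finish. The sketch below shows one way to poll for a terminal state. To keep it runnable without a workspace, the state lookup is passed in as a callable; in a real script that callable might wrap `w.jobs.get_run(run_id=run_id).state.life_cycle_state.value`, which is an assumption about the SDK's return shape that should be verified. The function name `wait_for_run` and the set of terminal states are illustrative.

```python
import time
from typing import Callable

# Assumption: life cycle states after which a run will not change again.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def wait_for_run(
    get_life_cycle_state: Callable[[int], str],
    run_id: int,
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Poll until the run reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_life_cycle_state(run_id)
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run {run_id} still in state {state} after timeout")
        time.sleep(poll_seconds)


# Stand-in for the workspace: a fake lookup that walks through three states.
fake_states = iter(["PENDING", "RUNNING", "TERMINATED"])
final_state = wait_for_run(lambda run_id: next(fake_states), run_id=67890, poll_seconds=0)
print(final_state)
```

A terminal life cycle state does not by itself mean success; the run's result state still has to be checked before downstream work proceeds.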

CLI Operations

# List jobs
databricks jobs list

# Get job details
databricks jobs get 12345

# Run job
databricks jobs run-now 12345

# Run with parameters
databricks jobs run-now --json '{"job_id": 12345, "job_parameters": {"env": "prod"}}'

# Cancel run
databricks jobs cancel-run 67890

# Delete job
databricks jobs delete 12345

Asset Bundle Operations

# Validate configuration
databricks bundle validate

# Deploy job
databricks bundle deploy

# Run job
databricks bundle run my_job_resource_key

# Deploy to specific target
databricks bundle deploy -t prod

# Destroy resources
databricks bundle destroy

Permissions (DABs)

resources:
  jobs:
    my_job:
      name: "My Job"
      permissions:
        - level: CAN_VIEW
          group_name: "data-analysts"
        - level: CAN_MANAGE_RUN
          group_name: "data-engineers"
        - level: CAN_MANAGE
          user_name: "admin@example.com"

Permission levels:

  • CAN_VIEW - View job and run history
  • CAN_MANAGE_RUN - View, trigger, and cancel runs
  • CAN_MANAGE - Full control including edit and delete
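
The three levels listed above are cumulative, which a lookup table makes explicit. This is only an illustration of the descriptions in the list: the action names (`view`, `trigger_run`, `cancel_run`, `edit`, `delete`) and the helper `is_allowed` are hypothetical and do not correspond to an API.

```python
# Actions granted by each level, taken from the descriptions above.
ALLOWED_ACTIONS = {
    "CAN_VIEW": {"view"},
    "CAN_MANAGE_RUN": {"view", "trigger_run", "cancel_run"},
    "CAN_MANAGE": {"view", "trigger_run", "cancel_run", "edit", "delete"},
}


def is_allowed(level: str, action: str) -> bool:
    """Return True if the permission level covers the requested action."""
    if level not in ALLOWED_ACTIONS:
        raise ValueError(f"Unknown permission level: {level}")
    return action in ALLOWED_ACTIONS[level]


print(is_allowed("CAN_VIEW", "trigger_run"))
print(is_allowed("CAN_MANAGE_RUN", "trigger_run"))
print(is_allowed("CAN_MANAGE_RUN", "edit"))
```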

Common Issues

| Issue | Solution |
|-------|----------|
| Job cluster startup slow | Use job clusters with job_cluster_key for reuse across tasks |
| Task dependencies not working | Verify task_key references match exactly in depends_on |
| Schedule not triggering | Check pause_status: UNPAUSED and valid timezone |
| File arrival not detecting | Ensure path has proper permissions and uses cloud storage URL |
| Table update trigger missing events | Verify Unity Catalog table and proper grants |
| Parameter not accessible | Use dbutils.widgets.get() in notebooks |
| "admins" group error | Cannot modify admins permissions on jobs |
| Serverless task fails | Ensure task type supports serverless (notebook, Python) |
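
For the task dependency issue in the table above, a quick local check can catch mismatched `task_key` references and cycles before deployment. The sketch uses the standard library's `graphlib` (Python 3.9+) on a plain list of task dictionaries shaped like the YAML in this document; the function name `execution_order` is hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(tasks: list[dict]) -> list[str]:
    """Return task keys in a valid run order, or raise on bad references or cycles."""
    known = {t["task_key"] for t in tasks}
    graph = {}
    for task in tasks:
        deps = [d["task_key"] for d in task.get("depends_on", [])]
        unknown = [d for d in deps if d not in known]
        if unknown:
            raise ValueError(
                f"Task '{task['task_key']}' depends on undefined task(s): {unknown}"
            )
        graph[task["task_key"]] = deps  # node -> its predecessors
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"Dependency cycle detected: {exc.args[1]}") from exc


tasks = [
    {"task_key": "extract"},
    {"task_key": "transform", "depends_on": [{"task_key": "extract"}]},
    {"task_key": "load", "depends_on": [{"task_key": "transform"}]},
]
print(execution_order(tasks))
```

Task keys are compared exactly, so `Extract` and `extract` count as different tasks in this check.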

Metadata

License: unknown
Version: -
Updated: 2/20/2026
Publisher: databricks-solutions

Tags

api, ci-cd, database, github, observability