Databricks Workspace Toolkit

$59

Workspace provisioning scripts, cluster policies, notebook templates, Unity Catalog setup, and job orchestration patterns.

📁 17 files · 🏷 v1.0.0
Python · YAML · TOML · Shell · JSON · Markdown · Azure · Databricks · Spark

📁 File Structure 17 files

databricks-workspace-toolkit/
├── LICENSE
├── README.md
├── configs/
│   ├── cluster_policies.json
│   ├── job_templates/
│   │   ├── etl_job.json
│   │   └── ml_training_job.json
│   └── workspace_config.yaml
├── guides/
│   └── workspace-management.md
├── notebooks/
│   └── admin_dashboard.py
├── scripts/
│   ├── export_workspace.sh
│   └── setup_workspace.sh
└── src/
    ├── cluster_manager.py
    ├── job_manager.py
    ├── permissions_manager.py
    ├── secret_manager.py
    ├── unity_catalog_setup.py
    └── workspace_manager.py

📖 Documentation Preview README excerpt

Databricks Workspace Toolkit

Automate Databricks workspace management — clusters, jobs, secrets, Unity Catalog, and permissions.

Stop clicking through the UI. Manage your entire Databricks workspace programmatically with production-ready Python wrappers around the Databricks REST APIs.

---

What You Get

  • Workspace management — List, create, delete, import/export notebooks programmatically
  • Cluster automation — Create clusters, resize, manage pools, enforce auto-termination policies
  • Job orchestration — Create multi-task workflows, manage schedules, configure notifications
  • Secret management — Create scopes, store secrets, manage ACLs for secure credential handling
  • Unity Catalog setup — Bootstrap catalogs, schemas, tables, grants, and external locations
  • Permissions manager — Configure RBAC for clusters, jobs, notebooks, and SQL warehouses
  • Cluster policies — Cost control templates with instance type restrictions and spot pricing
  • Job templates — Ready-to-use ETL and ML training job definitions
  • Shell scripts — Bootstrap workspace setup and backup/export notebooks
  • Admin dashboard — Databricks notebook showing cluster usage, job status, and costs
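
To make the cluster policies item more concrete, here is a minimal sketch of what a cost control policy can look like and how a cluster spec might be checked against it. The contents of `configs/cluster_policies.json` are not shown in this preview, so the node types, limits, and the `check_cluster_spec` helper below are assumptions for illustration. Only the rule types (`allowlist`, `range`, `fixed`) follow the Databricks cluster policy definition format.

```python
from __future__ import annotations

from typing import Any

# Hypothetical policy in the Databricks cluster policy definition format.
# The node types and limits are illustrative, not the toolkit's shipped values.
POLICY: dict[str, dict[str, Any]] = {
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_DS3_v2", "Standard_DS4_v2"],
    },
    "autotermination_minutes": {
        "type": "range",
        "minValue": 10,
        "maxValue": 60,
        "defaultValue": 30,
    },
    "azure_attributes.availability": {
        "type": "fixed",
        "value": "SPOT_WITH_FALLBACK_AZURE",
    },
}


def check_cluster_spec(
    spec: dict[str, Any], policy: dict[str, dict[str, Any]]
) -> list[str]:
    """Return readable violations of `policy` found in a flat `spec`.

    Assumption: `spec` uses the same dotted attribute paths as the policy.
    """
    violations = []
    for attr, rule in policy.items():
        if attr not in spec:
            continue  # unset attributes are not checked in this sketch
        value = spec[attr]
        kind = rule["type"]
        if kind == "allowlist" and value not in rule["values"]:
            violations.append(f"{attr}: {value!r} is not in the allowlist")
        elif kind == "fixed" and value != rule["value"]:
            violations.append(f"{attr}: must be {rule['value']!r}")
        elif kind == "range":
            low = rule.get("minValue", float("-inf"))
            high = rule.get("maxValue", float("inf"))
            if not low <= value <= high:
                violations.append(f"{attr}: {value} is outside [{low}, {high}]")
    return violations
```

Databricks enforces policies server-side when a cluster is created, so a local check like this is only useful as an early warning, for example in CI before a job template is deployed.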

File Tree


databricks-workspace-toolkit/
├── README.md
├── manifest.json
├── LICENSE
├── src/
│   ├── workspace_manager.py       # Notebook CRUD, import/export
│   ├── cluster_manager.py         # Cluster lifecycle, pools, policies
│   ├── job_manager.py             # Jobs API: create, run, notifications
│   ├── secret_manager.py          # Secret scopes and ACLs
│   ├── unity_catalog_setup.py     # UC bootstrap: catalogs, schemas, grants
│   └── permissions_manager.py     # RBAC for workspace resources
├── configs/
│   ├── cluster_policies.json      # Cost control cluster policies
│   ├── workspace_config.yaml      # Environment configuration
│   └── job_templates/
│       ├── etl_job.json           # Multi-task ETL pipeline job
│       └── ml_training_job.json   # ML training with GPU cluster
├── scripts/
│   ├── setup_workspace.sh         # Bootstrap workspace setup
│   └── export_workspace.sh        # Backup notebooks and configs
├── notebooks/
│   └── admin_dashboard.py         # Admin overview dashboard
└── guides/
    └── workspace-management.md    # Best practices guide

Getting Started

1. Configure Your Workspace


# configs/workspace_config.yaml
workspace:
  host: "https://adb-1234567890.12.azuredatabricks.net"
  token_env_var: "DATABRICKS_TOKEN"
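
The config stores the name of an environment variable rather than the token itself, so credentials stay out of version control. The sketch below shows one way that indirection could be resolved once the YAML has been parsed into a dict (for example with `yaml.safe_load`). The `resolve_workspace` helper is hypothetical and not part of the toolkit. Unlike the `from_config` method in the code sample, which falls back to an empty token, this version fails fast when the variable is unset.

```python
from __future__ import annotations

import os


def resolve_workspace(config: dict) -> tuple[str, str]:
    """Return (host, token) from a parsed workspace_config.yaml dict.

    Hypothetical helper: raises instead of silently using an empty token.
    """
    ws = config["workspace"]
    env_var = ws.get("token_env_var", "DATABRICKS_TOKEN")
    token = os.environ.get(env_var, "")
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to a Databricks access token"
        )
    # Strip a trailing slash so API paths can be appended directly.
    return ws["host"].rstrip("/"), token


# The dict below mirrors the YAML snippet above after parsing.
EXAMPLE_CONFIG = {
    "workspace": {
        "host": "https://adb-1234567890.12.azuredatabricks.net",
        "token_env_var": "DATABRICKS_TOKEN",
    }
}
```

With `DATABRICKS_TOKEN` exported in the shell, `resolve_workspace(EXAMPLE_CONFIG)` returns the host and token as a pair.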

... continues with setup instructions, usage examples, and more.

📄 Code Sample .py preview

src/cluster_manager.py

"""
Databricks Cluster Manager
===========================
Create, configure, resize, and manage Databricks clusters and instance pools
via the Clusters API 2.0.

Datanest Digital | https://datanest.dev
"""
from __future__ import annotations

import logging
import os
import time
from dataclasses import dataclass
from typing import Any, Optional

import requests
import yaml

logger = logging.getLogger(__name__)


class ClusterManager:
    """Manage Databricks clusters and instance pools."""

    def __init__(self, host: str, token: str, timeout: int = 30) -> None:
        self._host = host.rstrip("/")
        self._timeout = timeout
        self._session = requests.Session()
        self._session.headers.update({
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        })

    @classmethod
    def from_config(cls, config_path: str) -> ClusterManager:
        """Create ClusterManager from a YAML config file."""
        with open(config_path) as f:
            config = yaml.safe_load(f)
        ws = config["workspace"]
        token = os.environ.get(ws.get("token_env_var", "DATABRICKS_TOKEN"), "")
        return cls(host=ws["host"], token=token, timeout=ws.get("timeout", 30))

    def _api(self, method: str, endpoint: str, **kwargs: Any) -> dict:
        """Make an authenticated API request."""
        url = f"{self._host}/api/2.0{endpoint}"
        resp = self._session.request(method, url, timeout=self._timeout, **kwargs)
        resp.raise_for_status()
        # ... 225 more lines ...
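
The `_api` method above composes each URL from the normalized host, the `/api/2.0` prefix, and an endpoint path. The following sketch mirrors that composition using only the standard library and does not send anything, which makes the resulting request easy to inspect. The `build_request` function is hypothetical and stands in for the `requests`-based original. `/clusters/list` is an endpoint of the Clusters API 2.0, and the token value is a placeholder.

```python
from __future__ import annotations

import json
import urllib.request
from typing import Any, Optional

API_PREFIX = "/api/2.0"


def build_request(
    host: str,
    token: str,
    method: str,
    endpoint: str,
    payload: Optional[dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) an authenticated Databricks REST request."""
    # Same normalization as ClusterManager.__init__: drop any trailing slash.
    url = f"{host.rstrip('/')}{API_PREFIX}{endpoint}"
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host and token; nothing is sent over the network here.
example = build_request(
    "https://adb-1234567890.12.azuredatabricks.net/",
    "dapi-example",
    "GET",
    "/clusters/list",
)
```

Keeping URL and header construction in one place, as `_api` does, means every public method inherits the same authentication and timeout handling.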