model-serving


Deploy and query Databricks Model Serving endpoints. Use when (1) deploying MLflow models or AI agents to endpoints, (2) creating ChatAgent/ResponsesAgent agents, (3) integrating UC Functions or Vector Search tools, (4) querying deployed endpoints, (5) checking endpoint status. Covers classical ML models, custom pyfunc, and GenAI agents.

Updated 2/6/2026


Databricks Model Serving

Deploy MLflow models and AI agents to scalable REST API endpoints.

Quick Decision: What Are You Deploying?

| Model Type | Pattern | Reference |
|---|---|---|
| Traditional ML (sklearn, xgboost) | mlflow.sklearn.autolog() | 1-classical-ml.md |
| Custom Python model | mlflow.pyfunc.PythonModel | 2-custom-pyfunc.md |
| GenAI Agent (LangGraph, tool-calling) | ResponsesAgent | 3-genai-agents.md |

Prerequisites

  • DBR 16.1+ recommended (pre-installed GenAI packages)
  • Unity Catalog enabled workspace
  • Model Serving enabled
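
One way to check the runtime prerequisite from a notebook is to compare the cluster's runtime version against 16.1. The sketch below assumes the version is available as a string such as `16.1` or `16.4.x-scala2.12` (on Databricks clusters this is typically exposed through the `DATABRICKS_RUNTIME_VERSION` environment variable). `meets_min_dbr` is a hypothetical helper, not part of any Databricks library.

```python
import os
import re


def meets_min_dbr(version: str, minimum: tuple = (16, 1)) -> bool:
    """Return True if a runtime version string is at least `minimum`.

    Strings that do not start with "<major>.<minor>" are treated as not meeting
    the minimum, so unrecognised formats should be checked by hand.
    """
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= minimum


# Assumption: this variable is set on the cluster; it is empty when run elsewhere.
current = os.environ.get("DATABRICKS_RUNTIME_VERSION", "")
if current and not meets_min_dbr(current):
    print(f"DBR {current} is older than 16.1; GenAI packages may need manual installation.")
```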

Reference Files

| Topic | File | When to Read |
|---|---|---|
| Classical ML | 1-classical-ml.md | sklearn, xgboost, autolog |
| Custom PyFunc | 2-custom-pyfunc.md | Custom preprocessing, signatures |
| GenAI Agents | 3-genai-agents.md | ResponsesAgent, LangGraph |
| Tools Integration | 4-tools-integration.md | UC Functions, Vector Search |
| Development & Testing | 5-development-testing.md | MCP workflow, iteration |
| Logging & Registration | 6-logging-registration.md | mlflow.pyfunc.log_model |
| Deployment | 7-deployment.md | Job-based async deployment |
| Querying Endpoints | 8-querying-endpoints.md | SDK, REST, MCP tools |
| Package Requirements | 9-package-requirements.md | DBR versions, pip |

Quick Start: Deploy a GenAI Agent

Step 1: Install Packages (in notebook or via MCP)

%pip install -U mlflow==3.6.0 databricks-langchain langgraph==0.3.4 databricks-agents pydantic
dbutils.library.restartPython()

Or via MCP:

execute_databricks_command(code="%pip install -U mlflow==3.6.0 databricks-langchain langgraph==0.3.4 databricks-agents pydantic")

Step 2: Create Agent File

Create agent.py locally with ResponsesAgent pattern (see 3-genai-agents.md).

Step 3: Upload to Workspace

upload_folder(
    local_folder="./my_agent",
    workspace_folder="/Workspace/Users/you@company.com/my_agent"
)

Step 4: Test Agent

run_python_file_on_databricks(
    file_path="./my_agent/test_agent.py",
    cluster_id="<cluster_id>"
)

Step 5: Log Model

run_python_file_on_databricks(
    file_path="./my_agent/log_model.py",
    cluster_id="<cluster_id>"
)

Step 6: Deploy (Async via Job)

See 7-deployment.md for job-based deployment that doesn't time out.
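
A job definition for this step could look roughly like the following. This is a sketch, not the contents of 7-deployment.md: the job name, the task key, and the `deploy_agent.py` script path are hypothetical, and the dictionary follows the general shape of a Databricks Jobs API create request (a `tasks` list with a `spark_python_task` running on an existing cluster). How the definition is passed to `create_job` depends on that tool's parameters, which are not shown here.

```python
import json


def build_deploy_job(script_path: str, cluster_id: str, name: str = "deploy-my-agent") -> dict:
    """Build a one-task job definition that runs a deployment script."""
    return {
        "name": name,
        "max_concurrent_runs": 1,  # avoid two deployments racing each other
        "tasks": [
            {
                "task_key": "deploy",
                "existing_cluster_id": cluster_id,
                "spark_python_task": {"python_file": script_path},
            }
        ],
    }


job = build_deploy_job(
    # Hypothetical script that performs the actual deployment call.
    script_path="/Workspace/Users/you@company.com/my_agent/deploy_agent.py",
    cluster_id="<cluster_id>",
)
print(json.dumps(job, indent=2))
```

Because the script runs as a job on the cluster, the call that triggers it can return immediately, and the long-running endpoint creation no longer depends on a tool call staying open.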

Step 7: Query Endpoint

query_serving_endpoint(
    name="my-agent-endpoint",
    messages=[{"role": "user", "content": "Hello!"}]
)

Quick Start: Deploy a Classical ML Model

import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

# Enable autolog with auto-registration
mlflow.sklearn.autolog(
    log_input_examples=True,
    registered_model_name="main.models.my_classifier"
)

# Train - model is logged and registered automatically.
# X_train and y_train are assumed to be defined already (training features and labels).
model = LogisticRegression()
model.fit(X_train, y_train)

Then deploy via UI or SDK. See 1-classical-ml.md.


MCP Tools

If MCP tools are not available, use the SDK/CLI examples in the files listed under Reference Files.

Development & Testing

| Tool | Purpose |
|---|---|
| upload_folder | Upload agent files to workspace |
| run_python_file_on_databricks | Test agent, log model |
| execute_databricks_command | Install packages, quick tests |

Deployment

| Tool | Purpose |
|---|---|
| create_job | Create deployment job (one-time) |
| run_job_now | Kick off deployment (async) |
| get_run | Check deployment job status |

Querying

| Tool | Purpose |
|---|---|
| get_serving_endpoint_status | Check if endpoint is READY |
| query_serving_endpoint | Send requests to endpoint |
| list_serving_endpoints | List all endpoints |

Common Workflows

Check Endpoint Status After Deployment

get_serving_endpoint_status(name="my-agent-endpoint")

Returns:

{
    "name": "my-agent-endpoint",
    "state": "READY",
    "served_entities": [...]
}
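
Because an endpoint can take a while to become ready, the status check is usually repeated until the state changes. The loop below is a minimal sketch: `get_status` stands in for whatever returns the dictionary shown above (the MCP tool, an SDK call, or a REST request), and the `READY` and `NOT_READY` strings are taken from this document. No other state values are assumed.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], dict],
    timeout_s: float = 1800.0,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_status() until its "state" is READY or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status.get("state") == "READY":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint not ready after {timeout_s} s: {status}")
        sleep(interval_s)


# Demonstration with a stand-in status source that becomes ready on the third call.
_states = iter(["NOT_READY", "NOT_READY", "READY"])
result = wait_until_ready(
    lambda: {"name": "my-agent-endpoint", "state": next(_states)},
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(result["state"])
```

Injecting `sleep` keeps the loop testable without real delays; in a notebook the default `time.sleep` is usually what you want.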

Query a Chat/Agent Endpoint

query_serving_endpoint(
    name="my-agent-endpoint",
    messages=[
        {"role": "user", "content": "What is Databricks?"}
    ],
    max_tokens=500
)

Query a Traditional ML Endpoint

query_serving_endpoint(
    name="sklearn-classifier",
    dataframe_records=[
        {"age": 25, "income": 50000, "credit_score": 720}
    ]
)
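
When MCP tools are unavailable, the same two requests can be sent over REST. The sketch below only builds the URL and JSON body and does not send anything. It assumes the standard `/serving-endpoints/<name>/invocations` path, the workspace host is a placeholder, and `build_invocation` is a hypothetical helper. An endpoint serving a ResponsesAgent may expect the conversation under `input` rather than `messages`, so check 8-querying-endpoints.md for the exact body.

```python
import json
from typing import Optional


def build_invocation(
    host: str,
    endpoint: str,
    *,
    messages: Optional[list] = None,
    dataframe_records: Optional[list] = None,
    **params,
) -> tuple:
    """Return (url, json_body) for a chat-style or a tabular scoring request.

    Extra keyword arguments (for example max_tokens) are copied into the body as-is.
    """
    if (messages is None) == (dataframe_records is None):
        raise ValueError("Pass exactly one of messages or dataframe_records")
    url = f"{host.rstrip('/')}/serving-endpoints/{endpoint}/invocations"
    if messages is not None:
        body = {"messages": messages, **params}
    else:
        body = {"dataframe_records": dataframe_records, **params}
    return url, json.dumps(body)


url, body = build_invocation(
    "https://<workspace-host>",  # placeholder host
    "my-agent-endpoint",
    messages=[{"role": "user", "content": "What is Databricks?"}],
    max_tokens=500,
)
print(url)
print(body)
```

The request would then be sent as an HTTP POST with an `Authorization: Bearer <token>` header and a JSON content type.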

Common Issues

| Issue | Solution |
|---|---|
| Invalid output format | Use self.create_text_output_item(text, id), not raw dicts. |
| Endpoint NOT_READY | Deployment takes ~15 min. Poll with get_serving_endpoint_status. |
| Package not found | Specify exact versions in pip_requirements when logging the model. |
| Tool timeout | Use job-based deployment, not synchronous calls. |
| Auth error on endpoint | Specify resources in log_model so automatic authentication passthrough can be set up. |
| Model not found | Check the Unity Catalog path: catalog.schema.model_name |
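
For the "Package not found" issue, a quick pre-flight check can flag requirements that are not pinned before the model is logged. This is a small sketch using only string checks; `unpinned` is a hypothetical helper, and the example list mirrors the install command from the Quick Start.

```python
def unpinned(requirements: list) -> list:
    """Return the requirement strings that lack an exact '==' version pin."""
    return [req for req in requirements if "==" not in req]


pip_requirements = [
    "mlflow==3.6.0",
    "langgraph==0.3.4",
    "databricks-langchain",
    "databricks-agents",
    "pydantic",
]
print(unpinned(pip_requirements))
```

In the Quick Start command only mlflow and langgraph are pinned, so the other three packages would need explicit versions for the logged environment to be reproducible.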

Critical: ResponsesAgent Output Format

WRONG - raw dicts don't work:

return ResponsesAgentResponse(output=[{"role": "assistant", "content": "..."}])

CORRECT - use helper methods:

return ResponsesAgentResponse(
    output=[self.create_text_output_item(text="...", id="msg_1")]
)

Available helper methods:

  • self.create_text_output_item(text, id) - text responses
  • self.create_function_call_item(id, call_id, name, arguments) - tool calls
  • self.create_function_call_output_item(call_id, output) - tool results

Metadata

  • License: unknown
  • Version: -
  • Updated: 2/6/2026
  • Publisher: databricks-solutions

Tags

api, ci-cd, github-actions, observability, security, testing