Databricks
This feature is available as an early‑access capability for Databricks. To enable it for your account, contact Flexera Support. For more information, see Contacting Flexera Support.
Flexera One uses bill data to provide an accurate view of your costs across accounts and services. This data is made available for pre-built and ad-hoc analyses. To gather this cost information, you must perform certain configuration steps and share specific data and credentials with Flexera One.
For Databricks, Flexera One ingests billing, compute, and Lakeflow data so you can allocate costs, identify waste, and analyze your Databricks spend in detail.
Overview of Databricks Bill Connect
The Databricks bill connect enables you to ingest your Databricks billing and compute data into Flexera One for cost reporting and analysis.
Flexera One pulls data from Databricks system tables and enriches it to provide detailed insights into your Databricks costs. The Databricks bill connect requires a SQL warehouse and a service principal with access to system tables. The service principal uses the warehouse to pull data for all workspaces in the same region as that warehouse. To enable ingestion for additional regions, configure a warehouse in at least one workspace in each region. The service principal must be able to use that warehouse and must have access to system tables in that workspace.
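The one-warehouse-per-region rule can be checked against a workspace inventory before you configure the connection. The following is a minimal sketch, not part of Flexera One or Databricks: the `uncovered_regions` helper, the record fields (`has_warehouse`, `sp_has_system_table_access`), and the sample workspace names and regions are all hypothetical and would need to be filled in from your own inventory.

```python
from collections import defaultdict


def uncovered_regions(workspaces):
    """Return regions that have no workspace usable for ingestion.

    A workspace counts as usable only if it has a SQL warehouse the service
    principal can use AND the service principal can read system tables there.
    (Field names are hypothetical, chosen for this illustration.)
    """
    covered = defaultdict(bool)
    for ws in workspaces:
        usable = ws["has_warehouse"] and ws["sp_has_system_table_access"]
        covered[ws["region"]] = covered[ws["region"]] or usable
    return sorted(region for region, ok in covered.items() if not ok)


# Hypothetical inventory: one region is fully set up, one is not.
workspaces = [
    {"name": "prod-east", "region": "us-east-1",
     "has_warehouse": True, "sp_has_system_table_access": True},
    {"name": "dev-east", "region": "us-east-1",
     "has_warehouse": False, "sp_has_system_table_access": False},
    {"name": "prod-eu", "region": "eu-west-1",
     "has_warehouse": True, "sp_has_system_table_access": False},
]

print(uncovered_regions(workspaces))  # ['eu-west-1']
```

In this sample, `us-east-1` is covered by a single configured workspace even though `dev-east` has no warehouse, while `eu-west-1` is reported because its only warehouse lacks system table access.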
Getting Started With Databricks Bill Connect
Complete the following steps to connect your Databricks billing and compute data to Flexera One for cost reporting purposes:
- Review the Prerequisites
- Creating a Service Principal and Granting Workspace and Warehouse Access
- Granting System Table and Workspace Discovery Access
- Connecting Databricks in Flexera One
- Verifying Bill Connect
- Viewing Import History to Verify Bill Status
You can view your Databricks costs in the built-in Databricks Analyzer and Resource Analyzer dashboards after you complete the bill connect configuration. For more information, see The "Default" Dashboards.
Mapping Flexera One Dimensions to Databricks Billing Columns
The following table describes the Flexera One dimensions and indicates which Databricks billing columns they map to:
| Flexera One Dimension | Databricks Billing Column | Meaning |
|---|---|---|
| Databricks Usage Name | Values can be warehouse.name, clusters.name, or pipeline_details.name | The Databricks compute resource that is being billed, such as a warehouse name, cluster name, job name, pipeline name, or notebook (for serverless). |
| Databricks Workflow Name | Values can be usage_metadata.job_name, the notebook path, or the pipeline name. | The workflow that triggers the compute usage. |
| Billing Account ID | usage.account_id | A unique identifier for the Databricks account. |
| Cloud Vendor Account | workspace_id | A unique identifier for the Databricks workspace. |
| Cloud Vendor Account Name | workspace_name | A human-readable name for the workspace ID. |
| Category | Derived dynamically based on usage.billing_origin_product | The category this usage record belongs to (for example, Compute, Storage, or Network). |
| Instance Type | usage_metadata.node_type | An identifier for common hardware configurations for VMs running a customer workload. For example, m4.xlarge (AWS), D2 v3 (Azure), n1-standard-4 (Google Cloud Platform), or Serverless (Databricks). |
| Line Item Type | Usage | Indicates the type of information the line item is for, such as usage, tax, fee, or credit. For Databricks, this value is always Usage. |
| Region | Derived from usage.sku_name / clusters.aws_attributes.zone_id | The physical region where a resource was consumed. |
| Resource Type | usage.sku_name | Indicates the type of resource consumed within a service. |
| Resource ID | usage.usage_metadata.*_id | The ID of a specific cloud resource. |
| Service | usage.billing_origin_product | The name of the high-level service consumed. For example, AmazonS3 (AWS), Microsoft.Compute (Azure), or Compute Engine (Google Cloud Platform). |
| Usage Type | usage.usage_type | Indicates information about the kind of usage incurred. |
| Tags | usage.custom_tags + clusters.tags + jobs.tags + warehouses.tags | Tags used for defining custom dimensions. |
| Usage Amount | usage.usage_quantity | The amount of usage generated in this record, in the units specified in Usage Unit (set to 0 if not needed). |
| Usage Unit | usage.usage_unit | The units that the Usage Amount metric is reported in. |
| Cost | usage.usage_quantity * list_prices.pricing(usage_unit) | The total cost generated in this record, in the currency specified in Currency Code. |
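
The Cost mapping in the last row multiplies each usage quantity by the list price for its unit. The following is a minimal sketch of that arithmetic only. It assumes a price lookup keyed by SKU name and usage unit; the SKU names, quantities, and prices are invented for illustration, and it ignores details such as price effective dates that a real ingestion pipeline would have to handle. It is not Flexera One's actual implementation.

```python
from decimal import Decimal

# Hypothetical usage records, loosely shaped like rows of a billing usage table.
usage_rows = [
    {"sku_name": "SKU_A", "usage_unit": "DBU", "usage_quantity": Decimal("12.5")},
    {"sku_name": "SKU_B", "usage_unit": "DBU", "usage_quantity": Decimal("4")},
]

# Hypothetical list prices per (SKU, unit). Assumption: one price per key,
# with no effective-date ranges.
list_prices = {
    ("SKU_A", "DBU"): Decimal("0.55"),
    ("SKU_B", "DBU"): Decimal("0.70"),
}


def cost(row, prices):
    """Cost = usage quantity * list price for the row's SKU and usage unit."""
    unit_price = prices[(row["sku_name"], row["usage_unit"])]
    return row["usage_quantity"] * unit_price


costs = [cost(row, list_prices) for row in usage_rows]
total = sum(costs, Decimal("0"))

print(costs)  # [Decimal('6.875'), Decimal('2.80')]
print(total)  # 9.675
```

`Decimal` is used instead of `float` so that the per-record costs add up exactly, which matters when many small line items are summed into a report total.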