Cube

Cube is the semantic layer between BigQuery and downstream reporting tools (dashboards, LLMs, apps, etc.). Cube models live in src/cube/ alongside dbt models and are version-controlled in this repository. Security — row-level filtering, column access policies, group membership — is enforced once in Cube for all downstream consumers.

Jump to: Concepts · Development Workflow · Review and Staging · Local Dev · Using Cube with Claude · Admin Setup

Concepts

Deployment types

Cube Cloud has two deployment types that control infrastructure scaling:

  • Development Instance — deallocates after inactivity and reallocates when a request comes in. Cheaper, but has a cold-start delay on the first query after idle.
  • Production Cluster — always running, no cold starts. Use once downstream tools are live and users expect instant responses.

Environments

A Cube Cloud deployment has two contexts:

  • Production environment — always tracks main. This is what downstream tools (Superset, Streamlit) connect to. Redeploys automatically when main changes.
  • Staging environments — one per branch, activated automatically when a user switches to that branch in the Cube Cloud UI. Each has its own isolated API endpoints. Multiple staging environments can be active simultaneously for different branches. Suspends after 10 minutes of inactivity by default; toggle always active in Settings → Staging Environments to keep a branch live for multi-day stakeholder review.

Development mode is the interactive UI session in Cube Cloud (not a separate environment — it targets whichever branch is currently active in the UI). Switching branches in development mode activates that branch's staging environment.

How KIPP uses them

One deployment covers everything:

| Context | What it is | Tracks |
| --- | --- | --- |
| Production | Production environment | main, auto-redeploys on merge |
| Staging | Per-branch staging environments, separate API URL | Any branch, multiple active simultaneously |

Staging environments are how analysts test feature branches, reviewers validate changes, and stakeholders preview models before merge — all within one deployment, with no additional infrastructure.

Development Workflow

1. Create a branch

git fetch origin main && git merge origin/main
git checkout -b you/feat/my-cube-change

2. Edit cube models in VS Code

Edit files in src/cube/model/cubes/ or src/cube/model/views/.

If main has new or renamed dbt models since you last compiled, regenerate the local manifest first:

uv run dbt compile --project-dir src/dbt/kipptaf

You only need this when dbt model definitions change. If you are only editing Cube YAML files, the existing manifest stays valid.
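
One way to decide whether a recompile is needed is to compare file timestamps. The sketch below is illustrative only: it assumes dbt's default layout (`models/` for model files and `target/manifest.json` for the compiled manifest), which this page does not confirm for the kipptaf project, and `manifest_is_stale` is a hypothetical helper name.

```python
from pathlib import Path


def manifest_is_stale(project_dir: str) -> bool:
    """Return True if any dbt model file is newer than target/manifest.json."""
    project = Path(project_dir)
    # Assumption: dbt's default target path; adjust if the project overrides it.
    manifest = project / "target" / "manifest.json"
    if not manifest.exists():
        return True  # never compiled, so a compile is needed
    manifest_mtime = manifest.stat().st_mtime
    # Assumption: dbt's default model path.
    models_dir = project / "models"
    for path in models_dir.rglob("*"):
        is_model_file = path.suffix in {".sql", ".yml", ".yaml"}
        if is_model_file and path.stat().st_mtime > manifest_mtime:
            return True
    return False
```

Calling `manifest_is_stale("src/dbt/kipptaf")` from the repository root returns True when a compile looks necessary. Timestamps are only a heuristic: a fresh checkout or branch switch can touch files without changing model definitions.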

Do not use the Playground Models tab.

In dev mode, Cube treats it as a live editor and overwrites YAML files. Edit in VS Code only.

3. Test locally

Run the Cube: Dev Server VS Code task to start Cube at localhost:4000. Use CUBE_GROUP_MAP in your .env to simulate different users' group membership — see Local Dev for setup.

4. Test in Cube Cloud

Push your branch, then switch to it in the Cube Cloud UI's development mode branch switcher. Cube Cloud activates a staging environment for the branch automatically.

Test in the Cube Cloud Playground. Your real Google Workspace group membership applies here — use this to verify security behavior against the real Directory API.

Check:

  • Cubes and views load without errors
  • Queries return expected results against live BigQuery data
  • Row-level security behaves correctly for your groups
  • Existing cubes and views still work (no regressions)

5. Open a PR

When ready, open a pull request from your feature branch to main.

Review and Staging

Peer review

The reviewing analyst:

  1. Reads through the YAML changes in the PR
  2. Switches to the author's branch (in Cube Cloud dev mode or locally) and tests in the Playground:
    • Do all cubes and views load without errors?
    • Do queries return expected results against live BigQuery data?
    • Do existing cubes and views still work?
  3. Runs locally with CUBE_GROUP_MAP set to the groups to simulate, if security behavior needs testing with specific group combinations
  4. Leaves review comments on the PR, or approves

Author and reviewer can work together in the same Cube Cloud Playground session since they're both hitting the same branch.

Stakeholder review

When a business user needs to validate changes before merge:

  1. In Cube Cloud, switch to the feature branch — this activates a staging environment for that branch
  2. Go to Settings → Staging Environments and toggle the branch to always active so queries don't fail when no one is viewing the branch
  3. Find the branch's API URL under API Credentials
  4. Point the staging instance of the connected tool (dashboard, etc.) at that URL and share it with the stakeholder
  5. Stakeholder tests queries and dashboards against live data — multiple branches can have active staging environments simultaneously
  6. Once the stakeholder approves, merge the PR — production redeploys automatically from main
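
Before sharing a branch URL with a stakeholder, it can help to smoke-test it with a single query. The sketch below builds, but does not send, a request to Cube's REST `/load` endpoint. It assumes the API URL shown under API Credentials ends in `/cubejs-api/v1` and that a valid API token is available; the host, the token, and the measure name `students.count` are placeholders, and `build_load_request` is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request


def build_load_request(api_url: str, token: str, query: dict) -> urllib.request.Request:
    """Build (but do not send) a GET request to the Cube REST API /load endpoint."""
    # The REST API takes the query as a JSON string in the `query` parameter.
    encoded = urllib.parse.urlencode({"query": json.dumps(query)})
    url = f"{api_url.rstrip('/')}/load?{encoded}"
    return urllib.request.Request(url, headers={"Authorization": token})


# Placeholder values: substitute the branch's API URL and a real token.
request = build_load_request(
    "https://example.cubecloud.dev/cubejs-api/v1",
    "YOUR-API-TOKEN",
    {"measures": ["students.count"]},  # hypothetical measure name
)
# To actually run the query:
#   result = json.load(urllib.request.urlopen(request))
```

A successful response to a known-good query confirms that the staging environment is awake and that the URL and token belong together.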

Local Dev

  1. cp src/cube/.env.example src/cube/.env
  2. Fill in CUBE_GROUP_MAP with your email and the groups you want to simulate:
    CUBE_GROUP_MAP='{"you@apps.teamschools.org":["cube-network-detail"]}'
    
  3. Run the Cube: Dev Server VS Code task (Ctrl+Shift+P → Tasks: Run Task)
  4. Playground opens at http://localhost:4000

ADC is used for BigQuery auth locally — run the GCloud: Application Default Login VS Code task first if you haven't already.
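
A malformed CUBE_GROUP_MAP is easy to produce by hand. The following sketch checks that a value has the shape shown in step 2 (a JSON object mapping an email address to a list of group names). `parse_group_map` is a hypothetical helper, not part of the repository, and the validation performed in cube.js may differ.

```python
import json


def parse_group_map(raw: str) -> dict:
    """Parse a CUBE_GROUP_MAP value and check it maps emails to lists of group names."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("CUBE_GROUP_MAP must be a JSON object")
    for email, groups in data.items():
        if "@" not in email:
            raise ValueError(f"key is not an email address: {email!r}")
        if not isinstance(groups, list) or not all(isinstance(g, str) for g in groups):
            raise ValueError(f"groups for {email} must be a list of strings")
    return data


# The value from step 2, without the surrounding shell quotes.
group_map = parse_group_map('{"you@apps.teamschools.org":["cube-network-detail"]}')
```

The most common mistake this catches is giving a single group as a bare string instead of a one-element list.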

Warnings

Do not set CUBE_GROUP_MAP in Cube Cloud — it bypasses the Directory API entirely and must only be used for local dev. The cube.js guard relies on NODE_ENV !== "production" as a second line of defense, but the variable should never be configured in Cube Cloud in the first place.
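
The decision described in the warning above can be sketched as follows. The real guard lives in cube.js and is JavaScript; this Python version only illustrates the logic as this page describes it, and `resolve_groups` and `directory_lookup` are hypothetical names.

```python
import json
from typing import Callable, Dict, List


def resolve_groups(
    email: str,
    env: Dict[str, str],
    directory_lookup: Callable[[str], List[str]],
) -> List[str]:
    """Use CUBE_GROUP_MAP only outside production; otherwise ask the Directory API."""
    raw = env.get("CUBE_GROUP_MAP")
    if raw and env.get("NODE_ENV") != "production":
        # Local dev: simulated membership, the Directory API is never called.
        return json.loads(raw).get(email, [])
    # Production, or no override configured: real group membership.
    return directory_lookup(email)
```

The sketch shows why the NODE_ENV check is only a second line of defense: if NODE_ENV were ever unset or mis-set in Cube Cloud, a configured CUBE_GROUP_MAP would silently replace real membership.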

Do not use the Cube Playground Models tab in dev mode. It overwrites YAML files in model/cubes/ and model/views/ with auto-generated content, discarding hand-authored definitions.

Using Cube with Claude

The Cube MCP server lets Claude query your organization's data using plain English — no SQL required. Once connected, you can ask questions like:

  • "What metrics are available?"
  • "Show me ADA by school for this year"
  • "What are the available dimensions in the student cube?"

Claude uses the Cube semantic layer to find and return the right data. You'll need a Cube API key from the data team before getting started.

Claude Desktop

  1. Check whether Node.js is installed:
    node --version

    If you see a version number, skip the install. Otherwise install via Homebrew:

    brew install node

    Then find the full path to npx (you'll need it below):

    which npx

  2. Open the config file. In Claude Desktop, go to Settings → Developer → Edit Config. Or navigate directly in Finder:
    ~/Library/Application Support/Claude/claude_desktop_config.json

    If the file doesn't exist yet, create it with an empty {}.

  3. Add the Cube MCP server. Replace [YOUR-API-KEY] with your key, and update the command path if your npx location differs:
    {
      "mcpServers": {
        "cube-mcp-server": {
          "command": "/opt/homebrew/bin/npx",
          "args": [
            "-y",
            "mcp-remote",
            "https://ai.gcp-us-central1.cubecloud.dev/api/mcp",
            "--transport",
            "http"
          ],
          "env": {
            "CUBE_TOKEN": "[YOUR-API-KEY]"
          }
        }
      }
    }

    If your config already has content, add mcpServers alongside the existing keys (or add cube-mcp-server inside an existing mcpServers object) rather than replacing anything.

  4. Restart Claude Desktop. Press Cmd+Q to fully quit (don't just close the window), then reopen. A tools/hammer icon in the bottom-right of the chat input confirms the server is connected.

Keep your API key private.

Treat it like a password: don't share your config file with anyone outside the pilot group.
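
Adding the server entry without disturbing an existing config can also be done with a script. This is a minimal sketch, not an official installer: `add_cube_server` is a hypothetical helper, and the entry it writes simply mirrors the JSON shown in the steps above.

```python
import json
from pathlib import Path

CUBE_MCP_URL = "https://ai.gcp-us-central1.cubecloud.dev/api/mcp"


def add_cube_server(config_path: Path, npx_path: str, api_key: str) -> dict:
    """Add the cube-mcp-server entry while keeping every existing key."""
    text = config_path.read_text().strip() if config_path.exists() else ""
    config = json.loads(text) if text else {}
    # setdefault keeps an existing mcpServers object and its other servers.
    servers = config.setdefault("mcpServers", {})
    servers["cube-mcp-server"] = {
        "command": npx_path,
        "args": ["-y", "mcp-remote", CUBE_MCP_URL, "--transport", "http"],
        "env": {"CUBE_TOKEN": api_key},
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example call (not executed here):
#   add_cube_server(
#       Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",
#       "/opt/homebrew/bin/npx",
#       "[YOUR-API-KEY]",
#   )
```

Because the function parses the file before writing, it raises an error instead of overwriting a config that is not valid JSON.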

Claude Code (VS Code)

In the Codespace, the Cube MCP server is already configured in .mcp.json, so no manual setup is required. When Claude Code first tries to use it, you'll be prompted to authenticate with Cube Cloud via OAuth. Approve the connection and you're ready to query.

Using Cube in Claude

Once connected, toggle Cube MCP on via the + or tools menu in your chat, then ask questions in plain English:

  • "What data do you have access to?"
  • "What metrics can I query?"
  • "Show me [metric] by [dimension] for [time period]"

Claude interprets your question using the Cube semantic layer and returns results directly in the chat. No table names, field names, or SQL required.

Troubleshooting

"Server disconnected" error: Claude most likely can't find npx. Run which npx in Terminal and make sure the path in your config matches exactly.

npx not found: Node.js isn't installed. Follow step 1 above.

Tools icon doesn't appear after restart: the JSON most likely has a formatting error (missing comma, mismatched brackets). Validate the file with a JSON linter; if you use an online one such as jsonlint.com, remove your API key before pasting.

Check the logs: for any other issue, check the MCP server log:

tail -f ~/Library/Logs/Claude/mcp-server-cube-mcp-server.log
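
For the JSON formatting check mentioned in Troubleshooting, a local check avoids pasting a file that contains an API key into a website. The sketch below uses only Python's standard `json` module; `json_error` is a hypothetical helper name.

```python
import json
from typing import Optional


def json_error(text: str) -> Optional[str]:
    """Return None if text is valid JSON, else a message with the error position."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# Example (not executed here):
#   from pathlib import Path
#   path = Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
#   print(json_error(path.read_text()) or "valid JSON")
```

The reported line and column point at the place where parsing failed, which for a missing comma is typically the start of the following entry.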

Admin Setup

Why a Service Account Key Is Required

Cube looks up each user's Google Workspace group membership at query time using the Admin Directory API. It calls the API as a service account with domain-wide delegation — a mechanism that lets the service account impersonate a Workspace super-admin, which is required for cross-domain group lookups.

Keyless authentication (Workload Identity Federation) would eliminate the need for a stored key, but it requires the workload to present an OIDC token from a trusted identity provider to Google's Security Token Service. Cube Cloud is managed SaaS running on Cube's infrastructure — KIPP cannot configure Workload Identity on those pods. A service account key is the only viable credential for this deployment.

If Cube is ever migrated to a self-hosted deployment on KIPP's GKE cluster, Workload Identity can replace the key — GoogleAuth would pick up ambient ADC credentials automatically and GOOGLE_DIRECTORY_SA_KEY could be removed.

Directory API Service Account Setup

Performed once by an admin. Creates the service account used by Cube to look up Workspace group membership.

1. Enable the Admin SDK API

In the GCP Console for project teamster-332318, search for Admin SDK API and enable it.

2. Create the service account

In IAM & Admin → Service Accounts, create a new service account:

  • Name: cube-directory-reader
  • ID: cube-directory-reader (resulting email: cube-directory-reader@teamster-332318.iam.gserviceaccount.com)
  • No GCP IAM roles — access is granted via Workspace DWD, not GCP IAM

Do not reuse the BigQuery service account (CUBEJS_DB_BQ_CREDENTIALS). The BigQuery SA has GCP data access; combining it with DWD means a single key compromise grants both warehouse access and domain-wide group enumeration.

3. Create a JSON key

In the service account details, go to Keys → Add key → Create new key → JSON. Download the file and keep it secure.

4. Grant domain-wide delegation in Google Workspace

In Google Workspace Admin, go to Security → Access and data control → API controls → Domain-wide delegation → Add new:

  • Client ID: the numeric client ID from the service account details page in GCP
  • OAuth scopes: https://www.googleapis.com/auth/admin.directory.group.readonly

5. Encode the key and set environment variables

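Base64-encode the JSON key with no line wrapping. The -w 0 flag below is GNU coreutils syntax (Linux, including Codespaces); the base64 that ships with macOS may not accept it, in which case base64 -i key.json should produce unwrapped output: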

base64 -w 0 key.json

Set the following in Cube Cloud:

  • GOOGLE_DIRECTORY_SA_KEY — the base64-encoded output
  • GOOGLE_DIRECTORY_SA_SUBJECT — email of the dedicated Workspace admin account used as the impersonation subject (see below)

Delete the key file after encoding — never store the raw JSON.
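
Before pasting the value into Cube Cloud, the encoding can be sanity-checked by decoding it again. This sketch is an assumption-laden illustration: `encode_key_file` and `check_encoded_key` are hypothetical helpers, and the fields checked are the standard ones in a Google service account JSON key. `encode_key_file` also serves as a platform-independent alternative to the base64 command.

```python
import base64
import json
from pathlib import Path

REQUIRED_FIELDS = {"type", "client_email", "private_key"}


def encode_key_file(path: str) -> str:
    """Base64-encode a key file as a single unwrapped line."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def check_encoded_key(encoded: str) -> str:
    """Decode a base64 service account key and return its client_email."""
    # validate=True rejects line breaks and other non-base64 characters,
    # so wrapped output fails here with a ValueError subclass.
    decoded = base64.b64decode(encoded.strip(), validate=True)
    info = json.loads(decoded)
    missing = REQUIRED_FIELDS - info.keys()
    if missing:
        raise ValueError(f"key is missing fields: {sorted(missing)}")
    if info["type"] != "service_account":
        raise ValueError(f"unexpected key type: {info['type']!r}")
    return info["client_email"]
```

The returned client_email should be the cube-directory-reader account, not the BigQuery service account.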

Key rotation

Rotate the key if it is ever exposed. In GCP, create a new key on the service account, update GOOGLE_DIRECTORY_SA_KEY in Cube Cloud, then delete the old key.

6. Create the impersonation subject account

The Directory API requires the service account to act as a Workspace super-admin — this is the GOOGLE_DIRECTORY_SA_SUBJECT value. It must be a dedicated shared admin account, not a personal one. If the account is suspended, deleted, or has super-admin revoked, every Cube query will fail with a default deny (no data visible to any user).

In Google Workspace Admin:

  1. Directory → Users → Add new user — create cube-service@apps.teamschools.org (or similar)
  2. Account → Admin roles → Super Admin → Assign on that user

Set GOOGLE_DIRECTORY_SA_SUBJECT to that account's email in Cube Cloud.

Never use a personal admin account as the impersonation subject.

Cube Cloud One-Time Setup

Performed in the Cube Cloud UI by an admin:

  1. Create a new Cube Cloud deployment. Use the Development Instance type for now; switch to Production Cluster before connecting downstream tools (Superset, Streamlit) so queries don't hit a cold start
  2. Connect the TEAMSchools/teamster GitHub repository
  3. Set the Cube project path to src/cube/
  4. Set the production branch to main (merges trigger an automatic redeploy)
  5. Set the following environment variables in Cube Cloud:
    • CUBEJS_DB_TYPE=bigquery
    • CUBEJS_DB_BQ_PROJECT_ID=teamster-332318
    • CUBEJS_DB_BQ_CREDENTIALS: BigQuery service account JSON (base64-encoded)
    • GOOGLE_DIRECTORY_SA_KEY: Admin Directory API service account key (base64-encoded)
    • GOOGLE_DIRECTORY_SA_SUBJECT: email of the dedicated Workspace admin account used as the impersonation subject (e.g. cube-service@apps.teamschools.org)
    • CUBEJS_SQL_SUPER_USER=cube-superset-service (SQL API super-user for Superset user impersonation, a follow-up integration)

    Cube Cloud automatically generates CUBEJS_API_SECRET, the SQL API username, and the SQL API password on deployment creation; find them under the deployment's Settings → Environment Variables. Do not set these manually.

  6. The service account for BigQuery needs roles/bigquery.dataViewer and roles/bigquery.jobUser on the teamster-332318 project
  7. The Admin Directory API service account needs domain-wide delegation scoped to https://www.googleapis.com/auth/admin.directory.group.readonly
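
The variable checklist above can be expressed as a small audit. This is a sketch under stated assumptions: `audit_env` is a hypothetical helper, and it expects a dictionary of variable names to values transcribed by hand, since this page does not describe a way to export variables from Cube Cloud.

```python
from typing import List, Mapping

# Variables this page says to set in Cube Cloud.
REQUIRED = [
    "CUBEJS_DB_TYPE",
    "CUBEJS_DB_BQ_PROJECT_ID",
    "CUBEJS_DB_BQ_CREDENTIALS",
    "GOOGLE_DIRECTORY_SA_KEY",
    "GOOGLE_DIRECTORY_SA_SUBJECT",
    "CUBEJS_SQL_SUPER_USER",
]
# Local-dev-only variable that must never be configured in Cube Cloud.
FORBIDDEN = ["CUBE_GROUP_MAP"]


def audit_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the checklist passes."""
    problems = [f"missing: {name}" for name in REQUIRED if not env.get(name)]
    problems += [f"must not be set: {name}" for name in FORBIDDEN if name in env]
    return problems
```

An empty result means every listed variable is present and CUBE_GROUP_MAP is absent; it says nothing about whether the values themselves are correct.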