🖥️aidp-cluster-ops

Plugin: oracle-ai-data-platform-workbench-engineer-agent

Description

Manage AIDP Spark compute clusters — list, status, start/stop/restart, installed libraries (JARs/Python), provision/scale a new cluster (driver/worker shapes, autoscale, GPU/RAPIDS, AI Compute), and connect external BI tools (JDBC/ODBC). Use when the user asks about clusters, needs to start/stop compute, create or scale a cluster, install libraries, set up a GPU cluster, use AI Compute for agent flows, connect Tableau/Power BI/DBeaver, or pick a cluster before running data work.

Use cases

✓ Starting, stopping, or restarting a cluster
✓ Creating or scaling a cluster
✓ Installing libraries
✓ Setting up a GPU cluster
✓ Connecting BI tools

`aidp-cluster-ops` — cluster lifecycle & libraries

Inspect and control AIDP Spark clusters. Most data skills depend on a running cluster (state ACTIVE), so this skill is the common pre-step. It is a control-plane skill: neither MCP nor the ai-data-engineer-agent repo is required.

When to use

"What clusters are there / is X running?", "start/stop/restart the cluster", "what libraries are installed", "check if compute is up before running data work".

Engine — official `aidp` CLI (control-plane)

Preferred engine is the official Oracle aidp CLI; oci raw-request is the fallback when the CLI isn't installed. Both hit the same data-plane REST API with the same auth — see references/aidp-cli-map.md for the full command map, references/oci-raw-request.md for base URL + auth ladder + async/error conventions, and references/no-mcp-rest-map.md for REST endpoint shapes.

| Op | CLI (preferred) | REST fallback |
| --- | --- | --- |
| List clusters | `aidp cluster list` | `GET /clusters` or `GET /workspaces/<ws>/clusters` |
| Status / config | `aidp cluster get --cluster-key <key>` | `GET /workspaces/<ws>/clusters/<key>` |
| Default cluster | `aidp cluster get-default` | (in list output) |
| Libraries | `aidp cluster list-libraries --cluster-key <key>` (`patch-library` to add/remove) | inside cluster GET |
| Start / Stop / Restart | `aidp cluster start\|stop\|restart --cluster-key <key>` | `POST /workspaces/<ws>/clusters/<key>/actions/start\|stop\|restart` |
| Logs / metrics | `aidp cluster search-logs\|download-logs\|summarize-metrics-data --cluster-key <key>` | `…/clusters/<key>/…` |

All CLI calls take --instance-id <DATALAKE_OCID> --auth api_key --profile DEFAULT --region <r>.

# CLI (preferred): list clusters
aidp cluster list --instance-id <DATALAKE_OCID> --auth api_key --profile DEFAULT --region us-ashburn-1

# CLI (preferred): cluster detail — state, config, connections
aidp cluster get --cluster-key <KEY> --instance-id <DATALAKE_OCID> --auth api_key --profile DEFAULT --region us-ashburn-1

# CLI (preferred): start (the CLI handles the required body for you)
aidp cluster start --cluster-key <KEY> --instance-id <DATALAKE_OCID> --auth api_key --profile DEFAULT --region us-ashburn-1

Mutating ops (start/stop/restart, patch-library) — for shared clusters, persist any non-trivial body to .aidp/payloads/ and confirm first (references/payloads.md).

Fallback (no CLI installed) — oci raw-request against https://aidp.<region>.oci.oraclecloud.com/20240831/dataLakes/<DATALAKE_OCID>/…:

# List clusters (verified GET)
oci raw-request --http-method GET \
  --target-uri "https://aidp.us-ashburn-1.oci.oraclecloud.com/20240831/dataLakes/<OCID>/workspaces/<WS>/clusters" \
  --profile DEFAULT

# Cluster detail — state, config, connections, AND installed libraries
oci raw-request --http-method GET \
  --target-uri "https://aidp.us-ashburn-1.oci.oraclecloud.com/20240831/dataLakes/<OCID>/workspaces/<WS>/clusters/<KEY>" \
  --profile DEFAULT

# Start (POST action) — a JSON body is REQUIRED (use {}); empty body 400s
oci raw-request --http-method POST \
  --target-uri "https://aidp.us-ashburn-1.oci.oraclecloud.com/20240831/dataLakes/<OCID>/workspaces/<WS>/clusters/<KEY>/actions/start" \
  --request-body '{}' --request-headers '{"content-type":"application/json"}' \
  --profile DEFAULT

Patterns

Find compute: aidp cluster list (REST GET /clusters) lists DataLake clusters; GET /workspaces/<ws>/clusters scopes the REST fallback to one workspace. Cross-workspace questions must pass the right <ws> — don't rely on a single default workspace.
Status + readiness: aidp cluster get --cluster-key <key> (REST GET /workspaces/<ws>/clusters/<key>) returns state (e.g. STARTING → ACTIVE), stateDetails, config, connections, and attached notebooks/sessions. Data/SQL skills should call this first; if the cluster is stopped, start it, then poll state until it reaches ACTIVE (a start takes minutes).
Lifecycle: aidp cluster start|stop|restart (REST …/actions/start|stop|restart). A 202 response means the action was accepted asynchronously, so poll the cluster until it reaches a terminal state. Confirm before stopping a shared cluster.
Libraries: aidp cluster list-libraries --cluster-key <key>; with the REST fallback the installed JARs + Python libs come back inside the cluster GET. Check before relying on a connector/lib.
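
The status + readiness pattern above can be sketched as a small polling helper. This is a minimal sketch, not part of the skill: `ensure_active`, `get_state`, and `start` are hypothetical names, and the callables stand in for whatever wraps `aidp cluster get` and `aidp cluster start` in your setup. Only STARTING and ACTIVE are state names taken from this document; `"STOPPED"` is an assumed value, so check the real `state` strings your instance returns.

```python
import time
from typing import Callable, FrozenSet


def ensure_active(
    get_state: Callable[[], str],
    start: Callable[[], None],
    *,
    stopped_states: FrozenSet[str] = frozenset({"STOPPED"}),  # assumed state name
    poll_seconds: float = 15.0,
    timeout_seconds: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Start the cluster if it is stopped, then poll until state is ACTIVE."""
    state = get_state()
    if state in stopped_states:
        # Issue start exactly once; a second start while STARTING returns 409.
        start()
    waited = 0.0
    while state != "ACTIVE":
        if waited >= timeout_seconds:
            raise TimeoutError(f"cluster still {state!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
        state = get_state()
    return state
```

Injecting `get_state`, `start`, and `sleep` keeps the loop independent of the engine (CLI or REST fallback) and lets it be exercised without a live cluster. Any state that never becomes ACTIVE ends in a `TimeoutError` rather than an endless loop.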

Caveats

REST fallback only — actions/start|stop|restart need a body {} (LIVE-VERIFIED 2026-06-09): calling with no body returns 400 InvalidParameter: The request body must not be null; passing --request-body '{}' returns 202. This (not workspace mismatch) was the original "start 400". The aidp CLI sets this body for you. A second start while already STARTING returns 409 Conflict (expected) on either engine.
Use the cluster's home workspace in the REST action URL — find it via GET /workspaces/<ws>/clusters (a cluster may not live in your default workspace). The CLI resolves the workspace from the cluster key.
Per the no-fabrication gate in oci-raw-request.md: don't present an endpoint/version/prefix as confirmed until a live 2xx (or documented 4xx) is recorded in rest-endpoint-map.md.
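
To make the first two caveats concrete, here is a minimal sketch that assembles the `oci raw-request` argument list for a lifecycle action, always with the `{}` body and the cluster's home workspace in the URL. The helper names (`action_argv`, `explain_status`) are hypothetical; the flags, URL shape, and status codes are the ones quoted in this document, and nothing here has been run against a live endpoint.

```python
import json

BASE = "https://aidp.{region}.oci.oraclecloud.com/20240831/dataLakes/{ocid}"
ACTIONS = ("start", "stop", "restart")


def action_argv(region, ocid, workspace, cluster_key, action, profile="DEFAULT"):
    """Build the argv for a REST-fallback lifecycle action (hypothetical helper)."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    uri = (
        BASE.format(region=region, ocid=ocid)
        + f"/workspaces/{workspace}/clusters/{cluster_key}/actions/{action}"
    )
    return [
        "oci", "raw-request",
        "--http-method", "POST",
        "--target-uri", uri,
        "--request-body", "{}",  # required: a missing body returns 400
        "--request-headers", json.dumps({"content-type": "application/json"}),
        "--profile", profile,
    ]


def explain_status(status):
    """Map the status codes described in the caveats to a short meaning."""
    meanings = {
        202: "accepted; poll the cluster until it reaches a terminal state",
        400: "InvalidParameter; check that the request body is {} and not empty",
        409: "conflict; the action is already in progress (expected on a repeat start)",
    }
    return meanings.get(status, "not covered by the caveats above")
```

The returned list can be passed to `subprocess.run(argv, check=True)` once the workspace and cluster key have been confirmed via a GET.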

Provision / scale a cluster (`aidp cluster create|update|delete`; REST `POST/PUT/DELETE …/workspaces/<ws>/clusters`)

Live-verified create body (a body of this shape provisioned agent_e2e_cluster → ACTIVE, 2026-06-10; the displayName below is an example):

{ "type": "USER", "displayName": "etl_cluster",
  "driverConfig": { "driverShape": "amd.generic", "driverShapeConfig": { "ocpus": 2, "memoryInGBs": 16, "gpus": 0 } },
  "workerConfig": { "minWorkerCount": 1, "maxWorkerCount": 1, "workerShape": "amd.generic",
                    "workerShapeConfig": { "ocpus": 2, "memoryInGBs": 16, "gpus": 0 } },
  "clusterRuntimeConfig": { "sparkVersion": "3.5.0", "type": "SPARK", "initScripts": [] },
  "autoTerminationMinutes": 120 }

displayName charset: must start with a letter; the only special chars allowed are underscore and slash. A hyphen (e.g. etl-cluster) → 400 InvalidParameter ("no special characters … except for underscore, slash") — use etl_cluster.
Shapes: AMD / ARM / Intel / NVIDIA GPU (platform-ref §12). Quickstart = 1 driver + ≤10 workers, AMD 2 OCPU/32 GB, autoscale (fast start); Custom = full control. Installing custom libs to a Quickstart cluster converts it to Custom.
Scale: static (minWorkerCount == maxWorkerCount) or autoscale (min < max).
Run duration: always-on (omit autoTerminationMinutes) or an idle timeout set via autoTerminationMinutes.
Runtime: Spark 3.5.0 / Delta 3.2.0 / Python 3.11 / Java 17 (Python + SQL user code only).
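
The naming, scaling, and run-duration rules above can be checked before a create call is sent. The sketch below is illustrative only: `validate_display_name`, `scale_mode`, and `build_create_body` are hypothetical helpers, the field names and shape values are copied from the live-verified body, and the regex assumes that digits are allowed after the first letter (the error message only rules out special characters other than underscore and slash).

```python
import re
from typing import Optional

# Assumption: letters, digits, underscore, and slash are allowed after a leading letter.
_DISPLAY_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_/]*")


def validate_display_name(name: str) -> str:
    if not _DISPLAY_NAME.fullmatch(name):
        raise ValueError(
            f"invalid displayName {name!r}: start with a letter; "
            "only underscore and slash are allowed as special characters"
        )
    return name


def scale_mode(min_workers: int, max_workers: int) -> str:
    """static when min == max, autoscale when min < max."""
    if min_workers < 0 or min_workers > max_workers:
        raise ValueError("need 0 <= minWorkerCount <= maxWorkerCount")
    return "static" if min_workers == max_workers else "autoscale"


def build_create_body(display_name: str, min_workers: int, max_workers: int,
                      auto_termination_minutes: Optional[int] = None) -> dict:
    """Assemble a create body using the shape values from the verified example."""
    scale_mode(min_workers, max_workers)  # raises on an invalid worker range
    shape = {"ocpus": 2, "memoryInGBs": 16, "gpus": 0}
    body = {
        "type": "USER",
        "displayName": validate_display_name(display_name),
        "driverConfig": {"driverShape": "amd.generic", "driverShapeConfig": dict(shape)},
        "workerConfig": {
            "minWorkerCount": min_workers,
            "maxWorkerCount": max_workers,
            "workerShape": "amd.generic",
            "workerShapeConfig": dict(shape),
        },
        "clusterRuntimeConfig": {"sparkVersion": "3.5.0", "type": "SPARK", "initScripts": []},
    }
    if auto_termination_minutes is not None:
        # Omitting the key leaves the cluster always-on.
        body["autoTerminationMinutes"] = auto_termination_minutes
    return body
```

For example, `build_create_body("etl_cluster", 1, 1, 120)` reproduces the verified body, while `build_create_body("etl-cluster", 1, 1)` raises locally instead of waiting for the service to return 400.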

Libraries (install) — `aidp cluster patch-library`

Formats .jar / .whl / requirements.txt; source = workspace / volume / uploaded file. Must restart the cluster after installing. Notebook-scoped installs (!pip install …, .ipynb only) don't need a restart but apply only to that notebook (see aidp-notebooks).

GPU / RAPIDS clusters (platform-ref §14)

GPU shapes: 1 GPU = 15 OCPU/24 GB GPU mem; 2 GPU = 30 OCPU/48 GB. Rule: both driver AND worker must be NVIDIA GPU (no CPU/GPU mixing).

Required RAPIDS Spark configs:

spark.plugins=com.nvidia.spark.SQLPlugin
spark.shuffle.manager=com.nvidia.spark.rapids.spark350.RapidsShuffleManager
spark.rapids.shuffle.mode=MULTITHREADED
spark.executor.resource.gpu.amount=1
spark.task.resource.gpu.amount=1/executor.cores

Libraries: Spark RAPIDS, Spark RAPIDS ML (cuML).
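
The last required setting is a formula, not a literal value: `spark.task.resource.gpu.amount` is 1 divided by the executor core count. A minimal sketch of deriving the full set of properties, assuming a hypothetical helper `rapids_spark_conf` and that you know the executor core count for your shape (where these properties are entered for an AIDP cluster is not covered here):

```python
def rapids_spark_conf(executor_cores: int) -> dict:
    """Return the required RAPIDS settings for a given executor core count."""
    if executor_cores < 1:
        raise ValueError("executor_cores must be at least 1")
    return {
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.shuffle.manager": "com.nvidia.spark.rapids.spark350.RapidsShuffleManager",
        "spark.rapids.shuffle.mode": "MULTITHREADED",
        "spark.executor.resource.gpu.amount": "1",
        # One GPU per executor is shared evenly across that executor's task slots.
        "spark.task.resource.gpu.amount": str(1 / executor_cores),
    }


def as_properties(conf: dict) -> str:
    """Render the settings as key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in conf.items())
```

With `executor_cores=4`, the task-level amount comes out as `0.25`, so four tasks share the single GPU assigned to each executor.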

AI Compute (Preview) — powers agent flows

Specialized compute for agent flows (aidp-agent-flows / aidp-agent-highcode): 1–64 OCPU, restricted to PvtDefaultWorkspace. Private workspaces must connect to a private Autonomous AI Lakehouse (ALH), and the link is immutable once made; public workspaces can't use a private ALH, so AI features are unavailable there. Create via Workspace > Create > AI Compute. Start begins metering the compute and Stop releases it. Attached flows show on the cluster's Agent flows tab.

Connect external BI tools (JDBC / ODBC) — additive to OAC/FDI, not a replacement

The cluster Connection Details tab provides the simbaSpark JDBC and ODBC drivers for DBeaver / Tableau / Power BI. JDBC driver class com.simba.spark.jdbc.Driver; use the JDBC URL from that tab. Auth: token-based (no ociProfile in the URL → browser SSO) or API key (append ociProfile=<profile_name>). OAC connection setup itself is OAC-side (the OAC/fusion-bundle plugin); here we only expose the AIDP driver/URL.
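
The two auth modes differ only in whether `ociProfile` is present in the JDBC URL. The sketch below appends it for API-key auth. It is illustrative only: `with_oci_profile` is a hypothetical helper, the example URL is a placeholder and not a real AIDP endpoint, and the `;` separator is an assumption based on the usual Simba-style `;key=value` property syntax, so compare the result against the URL shown on the Connection Details tab.

```python
def with_oci_profile(jdbc_url: str, profile_name: str, sep: str = ";") -> str:
    """Append ociProfile=<profile_name> to switch from browser SSO to API-key auth."""
    if not profile_name:
        raise ValueError("profile_name must not be empty")
    if "ociProfile=" in jdbc_url:
        raise ValueError("URL already carries an ociProfile property")
    # Drop a trailing separator so the property is not preceded by ';;'.
    return f"{jdbc_url.rstrip(sep)}{sep}ociProfile={profile_name}"


# Placeholder URL for illustration; copy the real one from the Connection Details tab.
example = with_oci_profile("jdbc:spark://example.invalid:443/default;", "DEFAULT")
```

Leaving the URL untouched keeps token-based auth, which opens a browser SSO flow.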

Notes

For Spark job/stage/task diagnostics on a running cluster, use aidp-spark-debugging.
Optional accelerator: if an aidp MCP happens to be configured, its list_clusters / get_cluster_status / get_default_cluster / start_cluster / stop_cluster / restart_cluster / list_cluster_libraries tools wrap these same REST calls. The MCP is not required — the oci raw-request calls above are the source of truth.

References

references/aidp-cli-map.md — skill → official aidp CLI command map (primary engine)
references/oci-raw-request.md — base URL, auth ladder, async/errors
references/no-mcp-rest-map.md — cluster endpoint map + start-400 note
references/rest-endpoint-map.md — verification ledger

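The original text and its copyright belong to Anthropic and the respective plugin authors. The Japanese translation was generated automatically with the Claude API.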