Skill · Official · development

🔌aidp-rest-generic

Plugin: oracle-ai-data-platform-workbench-spark-connectors

Description

Pull data from any REST API into a Spark DataFrame using the AIDP `aidataplatform` Generic REST connector. Use when the user has a non-Fusion / non-EPM / non-Essbase REST endpoint with a `manifest.url` describing the schema. Auth is HTTP Basic with derived properties driving query parameters.

Use cases

✓ Load data using a schema described at manifest.url
✓ Connect to a REST API with HTTP Basic authentication

`aidp-rest-generic` — Generic REST via AIDP `aidataplatform` (`type=GENERIC_REST`)

Read from arbitrary REST APIs as a Spark DataFrame. The connector requires a server-published manifest (a small JSON describing each API endpoint, parameters, and response schema) so it knows how to parse responses without a custom integration.

When to use

Any REST endpoint that exposes a manifest URL (custom enterprise APIs commonly do).
The user mentions "Generic REST", "manifest URL", or "REST connector".

When NOT to use

For Fusion ERP/HCM/SCM REST → aidp-fusion-rest. That API has a different shape: no manifest, and paging capped at 499 items per page.
For Fusion BICC bulk extracts → aidp-fusion-bicc.
For EPM Cloud Planning → aidp-epm-cloud.
For Essbase → aidp-essbase.

Read

import os
from oracle_ai_data_platform_connectors.aidataplatform import (
    AIDP_FORMAT, aidataplatform_options,
)

opts = aidataplatform_options(
    type="GENERIC_REST",
    user=os.environ["REST_USER"],
    password=os.environ["REST_PASSWORD"],
    schema=os.environ.get("REST_SCHEMA", "default"),
    extra={
        "base.url":     os.environ["REST_BASE_URL"],     # e.g. http://api.internal/v1
        "manifest.url": os.environ["REST_MANIFEST_URL"], # e.g. http://api.internal/v1/manifest
        "auth.type":    "basic",
        "api":          os.environ["REST_API"],          # e.g. "getOrdersByOrderID"
        # Any number of derived.property.<name> values feed into the API call:
        "derived.property.orderNo": os.environ.get("REST_ORDER_NO", "12345"),
    },
)
df = spark.read.format(AIDP_FORMAT).options(**opts).load()
df.show(5)
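
For orientation, the sketch below builds (without sending) the kind of HTTP request that a Basic-auth read with one derived property would plausibly correspond to. It is not the connector's implementation: the `orders` path, the credentials, and the `sketch_request` helper are hypothetical, and in practice the manifest decides whether a derived property is sent as a path, query, or body parameter.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def sketch_request(base_url: str, path: str, user: str, password: str, params: dict) -> Request:
    """Build (but do not send) a GET request with Basic auth and query parameters."""
    # HTTP Basic auth is "Basic " + base64("user:password").
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    url = f"{base_url.rstrip('/')}/{path.lstrip('/')}?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Basic {token}"})


# Hypothetical endpoint path and credentials, for illustration only.
req = sketch_request("http://api.internal/v1", "orders", "alice", "s3cret", {"orderNo": "12345"})
print(req.full_url)
print(req.get_header("Authorization"))
```

Because Basic auth only base64-encodes the credentials, an `https://` base URL is the safer choice wherever the API offers one.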

Manifest contract

The manifest describes:

apis — the named API operations (e.g. getOrdersByOrderID)
parameters — what the connector should send (path/query/body)
responseSchema: the Spark schema the connector should apply to the response

If you don't have a manifest URL, this connector won't work — fall back to the requests-based pattern in aidp-fusion-rest and adapt for your API.
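
Since the connector depends entirely on the manifest, it can help to sanity-check a manifest before pointing a cluster at it. The sketch below assumes a hypothetical layout in which `apis` maps each operation name to an object holding `parameters` and `responseSchema`. The text above names those three elements but not how they nest, so adjust the lookups to the real file. `check_manifest` is an illustrative helper, not part of the connector.

```python
import json

# Hypothetical manifest layout: the three element names come from the text above,
# but the nesting and field contents shown here are assumptions for illustration.
SAMPLE_MANIFEST = json.loads("""
{
  "apis": {
    "getOrdersByOrderID": {
      "parameters": [{"name": "orderNo", "in": "query"}],
      "responseSchema": {"orderNo": "string", "status": "string"}
    }
  }
}
""")


def check_manifest(manifest: dict, api: str) -> list[str]:
    """Return a list of problems found for the requested API (empty if none)."""
    apis = manifest.get("apis")
    if not isinstance(apis, dict):
        return ["manifest has no 'apis' object"]
    operation = apis.get(api)
    if operation is None:
        return [f"api {api!r} is not declared; known apis: {sorted(apis)}"]
    problems = []
    for key in ("parameters", "responseSchema"):
        if key not in operation:
            problems.append(f"api {api!r} is missing {key!r}")
    return problems


print(check_manifest(SAMPLE_MANIFEST, "getOrdersByOrderID"))  # []
print(check_manifest(SAMPLE_MANIFEST, "searchOrders"))
```

Returning a list of problems rather than raising on the first one makes it easier to fix a hand-authored manifest in a single pass.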

Multiple derived properties

Pass each as a separate extra={} key:

extra={
    "base.url":     "...",
    "manifest.url": "...",
    "auth.type":    "basic",
    "api":          "searchOrders",
    "derived.property.fromDate":  "2025-01-01",
    "derived.property.toDate":    "2025-12-31",
    "derived.property.status":    "OPEN",
}
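
When the parameter list is long or built at runtime, the prefixed keys can be generated rather than typed by hand. This is a minimal sketch: `derived_properties` is a hypothetical helper, and converting every value to a string is an assumption about what the connector expects.

```python
def derived_properties(params: dict) -> dict:
    """Prefix each parameter name with 'derived.property.' and stringify its value."""
    return {f"derived.property.{name}": str(value) for name, value in params.items()}


extra = {
    "base.url": "...",
    "manifest.url": "...",
    "auth.type": "basic",
    "api": "searchOrders",
    # Expands to the three derived.property.* keys shown above.
    **derived_properties({"fromDate": "2025-01-01", "toDate": "2025-12-31", "status": "OPEN"}),
}
```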

Manifest from a workspace / volume path (`manifest.path`)

If the manifest is a static file you've uploaded to your AIDP workspace or a Volume — instead of being served over HTTP — use manifest.path instead of manifest.url. Same shape, different source. Useful when the manifest is hand-authored or version-pinned alongside your notebook.

opts = aidataplatform_options(
    type="GENERIC_REST",
    user=os.environ["REST_USER"],
    password=os.environ["REST_PASSWORD"],
    schema="default",
    extra={
        "base.url":      os.environ["REST_BASE_URL"],
        "manifest.path": "/Volumes/myvol/manifests/orders_api.json",
        "auth.type":     "basic",
        "api":           "searchOrders",
        "derived.property.status": "OPEN",
    },
)
df = spark.read.format(AIDP_FORMAT).options(**opts).load()

The path can be:

/Volumes/<catalog>/<schema>/<volume>/path/to/manifest.json (AIDP Volume)
/Workspace/Shared/.../manifest.json (workspace file; works, but can be unreliable because access goes through FUSE)

Volume paths are the preferred location.
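
The choice between the two options can be made from the manifest location itself. The sketch below is one way to do that, assuming a hypothetical helper named `manifest_option`. It picks `manifest.url` for HTTP(S) locations, `manifest.path` otherwise, and warns on workspace paths.

```python
import warnings


def manifest_option(location: str) -> dict:
    """Return the single manifest option matching where the manifest lives."""
    if location.startswith(("http://", "https://")):
        return {"manifest.url": location}
    if location.startswith("/Workspace/"):
        # Workspace files are described above as working but unreliable via FUSE.
        warnings.warn("workspace manifest paths can be unreliable; prefer a Volume path")
    return {"manifest.path": location}


print(manifest_option("/Volumes/myvol/manifests/orders_api.json"))
```

The returned dict can be merged into `extra` with `**manifest_option(location)`, so the same notebook works whether the manifest is served over HTTP or pinned in a Volume.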

Gotchas

auth.type=basic only. If the API uses OAuth, API key headers, or mTLS, this connector won't help; call the API directly with the Python requests library instead.
Manifest must be reachable from the AIDP cluster's VCN. Egress restrictions apply.
The schema option is the AIDP/Spark logical schema for the resulting DataFrame, not a server-side one. Use default if unsure.
Paging is handled by the connector based on the manifest. If the manifest declares maxPageSize, the connector batches automatically.
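
Several of these gotchas can be caught before a Spark job is launched. The following is a rough pre-flight sketch, not connector behaviour: `preflight` is a hypothetical helper, and treating `base.url` and `api` as required, and `manifest.url` / `manifest.path` as mutually exclusive, are assumptions drawn from the examples above.

```python
def preflight(extra: dict) -> list[str]:
    """Return human-readable problems with a GENERIC_REST option dict (empty if none)."""
    problems = []
    if extra.get("auth.type") != "basic":
        problems.append("auth.type must be 'basic'; other schemes are not supported")
    # Assumption: the manifest comes from exactly one source.
    if ("manifest.url" in extra) == ("manifest.path" in extra):
        problems.append("set exactly one of manifest.url or manifest.path")
    # Assumption: every example above sets these two keys, so treat them as required.
    for key in ("base.url", "api"):
        if not extra.get(key):
            problems.append(f"missing required option {key!r}")
    return problems


print(preflight({"base.url": "http://api.internal/v1", "auth.type": "oauth", "api": "searchOrders"}))
```

This only checks the option dict. Whether the manifest is reachable from the cluster's VCN still has to be verified from the cluster itself.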

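References

Helper: scripts/oracle_ai_data_platform_connectors/aidataplatform.py
Official sample: oracle-samples/oracle-aidp-samples → data-engineering/ingestion/Read_Only_Ingestion_Connectors.ipynb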

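The original text and its copyright belong to Anthropic and the respective plugin authors. The Japanese translation of this page is an automatic translation produced with the Claude API.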