🧠output-dev-model-selection

Plugin: outputai

Description

Pick the right LLM model for an Output SDK prompt file. Use when writing a new .prompt file, reviewing a model choice, or upgrading a stale model. Walks through priority (reasoning/balance/speed/cost), provider selection, and a live lookup against the Vercel AI Gateway model index.

Use cases

✓ Writing a new .prompt file
✓ Reviewing a model choice
✓ Upgrading a stale model

Picking a Model for an Output SDK Prompt

This skill is the single source of truth for model selection across Output SDK skills and agents. Other skills link here instead of pinning specific model IDs, because model rosters drift faster than docs.

Live model snapshot

We run this at skill-load time to fetch the 10 most recently released models per provider from the Vercel AI Gateway:

# Fetch the gateway model index and keep the 10 most recently released models per provider.
output=$(curl -fsS https://ai-gateway.vercel.sh/v1/models 2>/dev/null | jq '
  .data as $models
  | {
      anthropic: ([ $models[] | select(.id | startswith("anthropic/")) ] | sort_by(.released) | reverse | .[0:10]),
      openai:    ([ $models[] | select(.id | startswith("openai/"))    ] | sort_by(.released) | reverse | .[0:10]),
      google:    ([ $models[] | select(.id | startswith("google/"))    ] | sort_by(.released) | reverse | .[0:10])
    }
' 2>/dev/null)
# Print the snapshot, or a placeholder if curl, jq, or the network call failed.
if [ -n "$output" ]; then printf '%s\n' "$output"; else echo "(snapshot unavailable)"; fi

Snapshot Fallback

If the block above is empty or shows `(snapshot unavailable)`, the snapshot was not fetched automatically, most likely because jq, curl, or network access is missing. Query and filter the snapshot yourself before continuing.

Snapshot shape

{
  "anthropic": [ <model>, ..., <up to 10> ],
  "openai":    [ <model>, ..., <up to 10> ],
  "google":    [ <model>, ..., <up to 10> ]
}

Each <model> is the unmodified gateway payload. Useful fields per model:

| Field | What to use it for |
| --- | --- |
| `id` | Provider-prefixed ID (e.g. `anthropic/claude-sonnet-4.6`); translate to prompt-file form in Step 5 |
| `released` | Unix timestamp of release. The snapshot is already sorted newest-first per provider. |
| `name` | Human-readable name |
| `description` | One-paragraph capability summary; read this when comparing similarly named tiers |
| `context_window` | Max input tokens. Matters when prompts include large context (codebases, long docs) |
| `max_tokens` | Max output tokens for a single response |
| `tags` | Capability flags: `reasoning`, `tool-use`, `vision`, `file-input`, `web-search`, `image-generation`, `explicit-caching`, `implicit-caching` |
| `pricing.input` / `pricing.output` | Per-token cost (USD). Multiply by 1,000,000 for the price per 1M tokens |
| `pricing.input_cache_read` | Cached-input price, usually about one-tenth of `input` |
| `type` | `language` for chat models; image models surface as `image-generation` and aren't valid for `.prompt` files |
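
As a quick check of the pricing arithmetic in the table, here is a minimal sketch that converts a per-token price into a price per 1M tokens. The helper name `per_million` and the sample price are illustrative assumptions, not part of the Output SDK or the gateway.

```shell
# Hypothetical helper: convert a per-token USD price (as found in
# pricing.input / pricing.output) into USD per 1M tokens.
per_million() {
  awk -v price="$1" 'BEGIN { printf "%.2f\n", price * 1000000 }'
}

# Illustrative value only, not a quote for any real model.
per_million 0.000003   # prints 3.00
```

The same helper works for `pricing.input_cache_read`, which makes it easy to compare cached and uncached input prices side by side.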

Decision flow

Step 1 — Determine task priority

Pick the first row that fits. If unclear, default to reasoning.

| Priority | Use when |
| --- | --- |
| reasoning (default) | Complex multi-step logic, structured output extraction, judges with edge cases, anything where a wrong answer costs more than a slow one |
| balance | Most generative work: summarization, classification, content drafting, conversation |
| speed | Short interactive responses, low-latency UI loops, simple transforms |
| cost | Bulk batch processing where token spend dominates and the quality floor is forgiving |

Step 2 — Determine provider

Scan existing *.prompt files in the workflow (and its siblings under src/workflows/) and tally what provider: they declare.

- If the workflow (or its sibling workflows) already uses one provider, match it. Mixing providers means the runtime needs API keys for each one, which is an operational footgun.
- If there are no existing prompts, default to anthropic.
- Only switch providers when the user explicitly asks, or when a feature you need (e.g. Gemini's useSearchGrounding, OpenAI's maxToolCalls) is provider-specific.
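
The tally described above can be sketched in a few lines of shell. This is a rough sketch under assumptions: the helper name `tally_providers` is hypothetical, and it assumes every `provider:` declaration sits at the start of a line, so a body line that happens to begin with `provider:` would be counted too.

```shell
# Hypothetical helper: count `provider:` declarations across *.prompt files
# under a directory (default: src/workflows) and print "<provider> <count>",
# most common first.
tally_providers() (
  root="${1:-src/workflows}"
  find "$root" -type f -name '*.prompt' -exec cat {} + \
    | grep '^provider:' \
    | sed -e 's/^provider:[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $2, $1 }'
)
```

If one provider clearly dominates the output, match it; if the output is empty, the default of anthropic applies.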

Step 3 — Map provider name to snapshot key

Output SDK provider: values don't always line up with the snapshot keys, since Vercel groups Gemini under google/:

| Output SDK provider | Snapshot key |
| --- | --- |
| `anthropic` | `anthropic` |
| `openai` | `openai` |
| `vertex` (Gemini models) | `google` |
| `vertex` (Claude models) | `anthropic` (then re-add the `@vertex` suffix manually) |
| `bedrock` | `anthropic` (then translate to bedrock namespace manually) |
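
The mapping table can be expressed as a small lookup. This is only a sketch: the function name `snapshot_key` and its second argument (a model family, `gemini` or `claude`, used to disambiguate `vertex`) are assumptions made for illustration.

```shell
# Hypothetical helper: map an Output SDK `provider:` value to a snapshot key.
# The second argument (model family) only matters for vertex.
snapshot_key() {
  case "$1" in
    anthropic|openai) printf '%s\n' "$1" ;;
    vertex)
      case "${2:-}" in
        gemini) echo google ;;
        claude) echo anthropic ;;  # re-add the @vertex suffix manually afterwards
        *) echo "vertex needs a model family: gemini or claude" >&2; return 1 ;;
      esac ;;
    bedrock) echo anthropic ;;     # translate to the bedrock namespace manually afterwards
    *) echo "unknown provider: $1" >&2; return 1 ;;
  esac
}

snapshot_key vertex gemini   # prints google
```

The manual steps for `vertex` (Claude) and `bedrock` are left as comments because the source does not spell out the exact target format.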

Step 4 — Pick a model from the provider's list

The list is already sorted newest-first. Walk it top-down and pick the first model whose id matches the tier for your priority.

Skip these by default:

- Models with type != "language" (e.g. gpt-image-2, gemini-embedding-2): not valid for .prompt files.
- IDs containing preview, alpha, or beta. Use stable / GA models only, even if a newer preview/alpha/beta exists. Only pick a non-stable model when the user explicitly asks for it ("use the preview", "I want the new beta", etc.).

| Priority | Anthropic: match `id` containing | OpenAI: match `id` | Google: match `id` |
| --- | --- | --- | --- |
| reasoning | `claude-opus-` (and `tags` includes `reasoning`) | ends with `-pro` | contains `-pro` |
| balance | `claude-sonnet-` | base `gpt-N.M` (no `-mini`/`-nano`/`-pro` suffix) | contains `-pro` |
| speed | `claude-haiku-` | ends with `-mini` | ends with `-flash` (not `-flash-lite`) |
| cost | `claude-haiku-` | ends with `-nano` | contains `-flash-lite` |
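
One way to read the tier table together with the skip rules is as a top-down walk over the newest-first ID list. The sketch below is illustrative only: the helper names are hypothetical, it works on IDs alone (so it cannot check `type` or the `reasoning` tag, which need the full snapshot entry), and the `-preview` ID in the usage line is made up.

```shell
# Hypothetical helpers; names are illustrative, not part of the Output SDK.

# True when an ID looks like a pre-release (preview / alpha / beta).
is_prerelease() {
  case "$1" in
    *preview*|*alpha*|*beta*) return 0 ;;
    *) return 1 ;;
  esac
}

# matches_tier <snapshot-key> <priority> <gateway-id>
matches_tier() {
  case "$1:$2" in
    anthropic:reasoning) case "$3" in *claude-opus-*) return 0 ;; esac ;;
    anthropic:balance)   case "$3" in *claude-sonnet-*) return 0 ;; esac ;;
    anthropic:speed|anthropic:cost) case "$3" in *claude-haiku-*) return 0 ;; esac ;;
    openai:reasoning) case "$3" in *-pro) return 0 ;; esac ;;
    openai:balance)
      # Base gpt-N.M only: no -mini / -nano / -pro (or any other) suffix.
      printf '%s\n' "$3" | grep -Eq '^openai/gpt-[0-9]+\.[0-9]+$' && return 0 ;;
    openai:speed) case "$3" in *-mini) return 0 ;; esac ;;
    openai:cost)  case "$3" in *-nano) return 0 ;; esac ;;
    google:reasoning|google:balance) case "$3" in *-pro*) return 0 ;; esac ;;
    google:speed) case "$3" in *-flash) return 0 ;; esac ;;
    google:cost)  case "$3" in *-flash-lite*) return 0 ;; esac ;;
  esac
  return 1
}

# pick_model <snapshot-key> <priority>: reads gateway IDs from stdin, newest
# first, and prints the first stable ID that matches the tier. Returns 1 when
# nothing stable matches, so the caller can stop and ask the user.
pick_model() {
  while IFS= read -r id; do
    is_prerelease "$id" && continue
    if matches_tier "$1" "$2" "$id"; then
      printf '%s\n' "$id"
      return 0
    fi
  done
  return 1
}

printf '%s\n' openai/gpt-5.5-preview openai/gpt-5.5-mini openai/gpt-5.5 \
  | pick_model openai balance   # prints openai/gpt-5.5
```

A non-zero return from `pick_model` is the signal to raise the question with the user rather than fall back to a pre-release model silently.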

Tie-breakers when multiple stable models match:

- Prefer the undated alias (claude-sonnet-4.6) over a dated snapshot (claude-sonnet-4-20250514) unless reproducibility is required (e.g. eval judges).
- If two truly equivalent rows exist, take the one with the larger context_window, then the lower pricing.input.
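
The second tie-breaker amounts to a two-key sort. Here is a minimal sketch, assuming the candidates have already been reduced to tab-separated rows of `id`, `context_window`, and `pricing.input`; the helper name `break_tie` and the row format are assumptions for illustration, and the alias-versus-dated-snapshot rule is not covered.

```shell
# Hypothetical helper: reads "<id><TAB><context_window><TAB><pricing.input>"
# rows on stdin and prints the preferred id: largest context_window first,
# then lowest pricing.input.
break_tie() {
  sort -t "$(printf '\t')" -k2,2nr -k3,3n | head -n 1 | cut -f1
}

# Placeholder rows, not real models or prices.
printf 'a\t200000\t0.000003\nb\t1000000\t0.000003\nc\t1000000\t0.0000025\n' \
  | break_tie   # prints c
```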

If every match in the snapshot is a preview/alpha/beta — meaning the entire tier is in pre-release — surface that to the user and ask before picking one. Don't silently use a preview because it was the only thing available.

Step 5 — Translate the gateway ID into a prompt-file model string

Gateway IDs carry a provider prefix and use dots; prompt-file IDs strip the prefix and use hyphens. Apply two transformations: drop everything up to and including the first /, then replace . with -.

| Gateway `id` | Prompt-file `model:` |
| --- | --- |
| `anthropic/claude-sonnet-4.6` | `claude-sonnet-4-6` |
| `openai/gpt-5.5` | `gpt-5-5` |
| `google/gemini-3-flash` | `gemini-3-flash` |
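
The two transformations are mechanical, so they can be sketched as a tiny shell function. The name `to_prompt_model` is hypothetical; the inputs are the three IDs from the table above.

```shell
# Hypothetical helper: gateway id -> prompt-file model string.
to_prompt_model() {
  model="${1#*/}"                      # drop everything up to and including the first "/"
  printf '%s\n' "$model" | tr '.' '-'  # replace every "." with "-"
}

to_prompt_model anthropic/claude-sonnet-4.6   # prints claude-sonnet-4-6
to_prompt_model openai/gpt-5.5                # prints gpt-5-5
to_prompt_model google/gemini-3-flash         # prints gemini-3-flash
```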

Drop the translated string into your .prompt frontmatter:

---
provider: anthropic
model: claude-sonnet-4-6
temperature: 0.7
maxTokens: 4096
---