Skill · Official · productivity

💻desktop-commander-overview

Plugin: desktop-commander

Description

Use for Desktop Commander MCP capabilities — persistent shells and REPLs, long-running processes, filesystem beyond the workspace, structured files (.xlsx, .docx, .pdf, images) and large local data files such as CSVs, ripgrep search at scale, SSH, or cross-turn state.

Use cases

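✓ You need a persistent shell or REPL
✓ You are working with long-running processes
✓ You need filesystem access outside the workspace
✓ You are processing structured files or large data files
✓ You need SSH or cross-turn state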

Desktop Commander MCP

Desktop Commander gives the agent reach across the user's actual computer — files, folders, terminals, processes, structured documents, and remote machines reachable over SSH. The tools' detailed schemas (parameters, return shapes, format-specific behavior) live in the MCP itself; this skill explains what they enable and how they compose into common workflows.

What this MCP gives the agent

Persistent shell sessions. Desktop Commander keeps a started process or session alive across tool calls. Inside a single long-lived shell, REPL, or SSH session, state carries forward — environment variables, working directory, activated virtualenvs, open connections, REPL variables — so the agent can cd, activate a venv, then send commands or code into that same session many turns later without re-setup. (Note: separate start_process calls open separate sessions and do not share shell state with each other; persistence is inside one session, not across them.)
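
As a rough local analogy (not the MCP's implementation), Python's standard-library code.InteractiveConsole behaves like one long-lived REPL session: names defined by one push are still there on the next, while a second console starts empty. The names session_a and session_b are illustrative only.

```python
import code

# Each console stands in for one session opened by a separate start_process call.
session_a = code.InteractiveConsole()
session_b = code.InteractiveConsole()

# First "turn": define state inside session A.
session_a.push("rows = [3, 1, 2]")

# A later "turn": the same session still sees `rows`, so no re-setup is needed.
session_a.push("total = sum(rows)")

print(session_a.locals["total"])    # 6: state carried forward inside session A
print("rows" in session_b.locals)   # False: session B never saw it
```

The same reasoning applies to working directories, environment variables, and activated virtualenvs in a shell session: they live in the session that set them, not in its siblings.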

Long-running processes. Start a dev server, watcher, build, training run, or test suite in the background and keep working. The MCP returns a process handle the agent can tail, interact with, or terminate across many turns, so a long-running command no longer blocks the workflow while the agent waits for it to exit.

Filesystem reach beyond the IDE workspace. Read, write, move, list, and inspect files anywhere the user has granted scope — Downloads, Documents, project folders outside the IDE, or any other granted folders. Useful for organize-and-clean tasks, batch document work, and any "look at the file my coworker just sent me" request that doesn't fit inside the IDE sandbox.

Surgical edits to existing files. The edit_block tool does exact-string find-and-replace with built-in safety: ambiguous matches fail loudly instead of silently overwriting the wrong thing, and an expected_replacements count prevents partial-match disasters. This carries a lower data-loss risk than rewriting a whole file based on the slice you happened to read. A wrong old_string or a wrong expected_replacements can still corrupt content, though, so review the changed content before considering the edit done.
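
The safety behavior described above can be modeled locally in a few lines. This is a minimal sketch of the idea, not edit_block itself: the function name replace_exact, the default of one expected replacement, and the sample config text are all assumptions made for illustration.

```python
def replace_exact(text: str, old: str, new: str, expected_replacements: int = 1) -> str:
    """Replace `old` with `new` only if it occurs exactly `expected_replacements` times."""
    if not old:
        raise ValueError("old string must not be empty")
    found = text.count(old)
    if found != expected_replacements:
        # Fail loudly rather than guess which occurrence was meant.
        raise ValueError(
            f"expected {expected_replacements} occurrence(s) of {old!r}, found {found}"
        )
    return text.replace(old, new)


source = "host_timeout = 30\nretry_limit = 30\n"

# Ambiguous: "30" appears twice, so a single-replacement edit is refused.
try:
    replace_exact(source, "30", "60")
except ValueError as err:
    print(err)

# Unambiguous: include enough surrounding text to match exactly one site.
print(replace_exact(source, "host_timeout = 30", "host_timeout = 60"))
```

Widening old_string until it matches exactly one site is usually safer than raising expected_replacements, which should be reserved for cases where every occurrence really should change.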

Binary and structured files handled directly by the MCP. Excel, DOCX, and PDF are first-class — read and modified through format-specific mechanisms rather than text-only approximations: Excel via cell-range JSON, DOCX via raw-XML edits, PDF via page-level operations on a new output file. The result is the real file in its original format, not a regenerated approximation. Images and PDFs return as viewable content for the agent.

Search at scale. Streaming, ripgrep-backed search across whole projects or folder trees. The agent picks between filename search and in-file content search, pages through results progressively without flooding context, and runs multiple concurrent searches when the query is ambiguous.

Remote machines via SSH. A long-lived SSH session inside a persistent shell turns the agent into a real ops tool: connect once, then tail logs, run diagnostics, deploy, or debug across many turns without reconnecting each step.

Process management. List, inspect, tail, and kill accessible processes (subject to OS permissions). Useful for cleaning up stale dev servers from previous sessions and for diagnosing CPU / memory issues.

Example workflows

Each example names the actual tool sequence. Calls below are written in pseudocode shorthand (tool_name("arg", flag=value)); the real tools take object-shaped arguments. Tool descriptions and full parameter sets live in the MCP itself.
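
To make the shorthand concrete, here is a sketch of how one pseudocode call maps to an object-shaped request. The helper tool_call and the repo path are hypothetical; the body follows the general shape of an MCP tools/call request and omits the JSON-RPC envelope fields. Only keyword arguments that appear in the text are used, because the parameter names behind the positional shorthand arguments are defined in the MCP's tool descriptions, not here.

```python
import json


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP-style tools/call request body from keyword arguments."""
    return {"method": "tools/call", "params": {"name": name, "arguments": arguments}}


# Shorthand: start_search(pattern="oldFunctionName", path=repo_root, searchType="content")
repo_root = "/abs/path/to/repo"  # hypothetical path for illustration
request = tool_call(
    "start_search",
    pattern="oldFunctionName",
    path=repo_root,
    searchType="content",
)
print(json.dumps(request, indent=2))
```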

"Debug this production issue"

Before running production-impacting SSH commands, explain the intended action and get user confirmation when the risk is non-trivial.

start_process("ssh user@prod.example.com", timeout_ms=...) opens a long-lived SSH session and returns a PID. interact_with_process(pid, "tail -f /var/log/app.log\n") starts streaming logs. Subsequent turns: read_process_output(pid, offset=-50) to see the last 50 lines as they arrive, interact_with_process(pid, "...") to run diagnostic commands in the same session. force_terminate(pid) to close the session when done — for sessions opened by start_process, force_terminate is the correct cleanup tool; kill_process is for arbitrary OS PIDs found via list_processes.

"Deploy this to staging"

Before deploys, restarts, migrations, or other environment-changing commands, summarize the action and confirm with the user unless they already explicitly asked for that exact operation.

start_process for the deploy command (could be a script, an SSH-piped command, or kubectl/gh etc.). read_process_output to track output and surface errors. If the deploy needs an interactive confirmation, interact_with_process(pid, "yes\n"). The session stays alive while the agent watches for completion or rollback.

"Run the dev server and iterate on the API"

start_process("npm run dev", timeout_ms=...) keeps the server up. The agent then loops: edit_block on the route file, read_process_output(pid, offset=-30) to see the server's reload, start_process("curl -s http://localhost:3000/api/...") for a one-shot test, repeat. The dev server never has to restart between code changes.

"Refactor across this monorepo"

start_search(pattern="oldFunctionName", path=repo_root, searchType="content") scopes every call site. get_more_search_results(sessionId) pages through. read_multiple_files(paths=[...]) confirms ambiguous hits in context. edit_block(file_path, old_string, new_string) per site, with expected_replacements set when the same substring legitimately appears multiple times in one file. Verify by re-running start_search on the old name and paging the results with get_more_search_results(sessionId) until the run completes — only then can you confirm zero remaining hits.
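
The verification step is easy to get wrong, so here is a sketch of the paging loop in isolation. The (results, is_complete) page shape and the fetch_page callable are hypothetical stand-ins; the real return shape of get_more_search_results is defined by the MCP and may differ.

```python
from typing import Callable, Iterator

# Hypothetical page shape: (results, is_complete).
Page = tuple[list[str], bool]


def drain_search(fetch_page: Callable[[], Page], max_pages: int = 1000) -> list[str]:
    """Collect results page by page until the search reports completion."""
    hits: list[str] = []
    for _ in range(max_pages):
        results, is_complete = fetch_page()
        hits.extend(results)
        if is_complete:
            return hits
    raise RuntimeError("search did not complete within max_pages")


# Stand-in for repeated get_more_search_results(sessionId) calls.
fake_pages: Iterator[Page] = iter([
    (["src/a.ts:12"], False),  # an early page has a hit, search still running
    ([], False),               # an empty page is not proof of zero hits
    ([], True),                # only now is the result set final
])
remaining = drain_search(lambda: next(fake_pages))
print(len(remaining))  # 1: the refactor is not finished yet
```

The point of the loop is that an empty intermediate page says nothing; only the completed run can confirm zero remaining hits.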

"Update the Q3 numbers in this spreadsheet and tweak the summary in the report"

read_file(path="/.../q3.xlsx", sheet="Revenue", range="A1:F50") returns the existing numbers as a JSON 2D array. edit_block(file_path="/.../q3.xlsx", range="Revenue!C12:C24", content=[[12345], ...]) updates the cells in place. For the report, DOCX editing is a two-read flow: first read_file(path="/.../report.docx") (offset 0) returns the document's outline (headings + paragraph text) so you can locate the summary section. Then read_file(path="/.../report.docx", offset=N, length=...) with N > 0 returns the raw underlying XML around that section — a non-zero offset is what flips the read into XML mode. Copy an XML fragment from that output as old_string and call edit_block(file_path, old_string, new_string) with the rewritten XML. The user gets back real .xlsx and .docx files, not regenerated approximations.
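
Before sending a cell update, it can help to check that the 2D array has the shape the range implies. The sketch below is a local pre-flight check only; whether the MCP itself requires an exact shape match is an assumption, and the helper names and the sample numbers are made up for illustration.

```python
import re


def column_index(letters: str) -> int:
    """Convert a column label such as 'C' or 'AA' to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def range_shape(a1_range: str) -> tuple[int, int]:
    """Return (rows, columns) for a range like 'Revenue!C12:C24' or 'A1:F50'."""
    cells = a1_range.split("!")[-1]  # drop an optional sheet prefix
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", cells)
    if match is None:
        raise ValueError(f"unsupported range: {a1_range!r}")
    col1, row1, col2, row2 = match.groups()
    rows = int(row2) - int(row1) + 1
    cols = column_index(col2) - column_index(col1) + 1
    if rows < 1 or cols < 1:
        raise ValueError(f"range is reversed: {a1_range!r}")
    return rows, cols


def fits(content: list[list], a1_range: str) -> bool:
    """True if `content` has exactly one row per range row and one value per column."""
    rows, cols = range_shape(a1_range)
    return len(content) == rows and all(len(row) == cols for row in content)


values = [[12345 + i] for i in range(13)]  # made-up numbers, one per row
print(range_shape("Revenue!C12:C24"))  # (13, 1)
print(fits(values, "Revenue!C12:C24"))  # True
```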

"Generate the Q3 report as a PDF"

Compose markdown content (header, table, charts via embedded HTML), then call write_pdf to render it to a new PDF file. The MCP's write_pdf tool description specifies the exact parameters and filename rules — follow that.

"Insert a cover page into this PDF"

write_pdf also supports modifying existing PDFs via an operations array (insert / delete pages). Use it for existing-PDF edits that produce a new PDF — adding a cover page, removing a section, merging in content from another file. See the write_pdf tool description for the operation shapes and parameter rules.

"Analyze this 200MB CSV"

start_process("python3 -i", timeout_ms=...) opens a Python REPL and returns a PID. interact_with_process(pid, "import pandas as pd; df = pd.read_csv('/abs/path.csv')") loads it once. Every subsequent question — df.describe(), df.groupby('col').size(), plot a chart — runs in the same already-loaded REPL. Libraries don't re-import, the dataframe doesn't re-load. The MCP itself recommends this workflow for any local data-file analysis.

"Run a quick Node script"

start_process("node:local", timeout_ms=...) opens a stateless Node execution mode on the MCP server itself, with ES imports supported. Each piece of JS is then sent via interact_with_process(pid, "<your JS here>") and runs independently (no shared state between calls). Good for one-shot transformations where keeping a long-lived REPL alive isn't worth it. Don't try to put code into the start_process command argument; only the runner type (node:local) goes there.

"Explain this codebase"

list_directory(path=repo_root, depth=3) for shape. start_search(pattern="export ", path=repo_root, searchType="content") to find the public surface. read_multiple_files(paths=[entrypoints]) for the actual code. The agent can keep narrowing without re-asking the user where to look.

"Organize my Downloads folder"

Resolve the path to absolute first (e.g., /Users/<user>/Downloads, not ~/Downloads). Then list_directory(path="/Users/<user>/Downloads", depth=1) to see what's there. start_search(pattern="*.pdf", path="/Users/<user>/Downloads", searchType="files") and similar for other types. create_directory for new folders. move_file per item. Preview the move plan before executing destructive ops.
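
A move plan can be computed and shown to the user before anything is moved. The sketch below is one possible way to do that locally; grouping by file extension and the function name plan_moves are assumptions, and the demo runs on a throwaway directory rather than a real Downloads folder.

```python
import tempfile
from pathlib import Path


def plan_moves(folder: Path) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs grouping files by extension; moves nothing."""
    root = folder.expanduser().resolve()  # absolute path, tilde expanded
    plan = []
    for item in sorted(root.iterdir()):
        if item.is_file() and item.suffix:
            bucket = item.suffix.lstrip(".").lower()  # e.g. "pdf"
            plan.append((item, root / bucket / item.name))
    return plan


# Demo on a temporary directory with three empty files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("invoice.pdf", "photo.JPG", "notes.pdf"):
        (Path(tmp) / name).touch()
    for src, dst in plan_moves(Path(tmp)):
        print(f"{src.name} -> {dst.parent.name}/{dst.name}")
```

Each pair in the plan corresponds to one create_directory (if the bucket folder is new) and one move_file call, issued only after the user has approved the preview.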

"Onboard me — what was happening last session?"

get_recent_tool_calls(maxResults=200) returns recent activity with arguments and outputs. list_sessions shows still-running terminal sessions. list_searches shows in-flight searches. list_processes shows what's still alive. Together they reconstruct the work without asking the user to recap.

"Why isn't the REPL responding?"

list_sessions — if Blocked: true, the REPL is waiting for input rather than hung. read_process_output(pid, offset=-100) to see what it last printed (often a prompt). interact_with_process(pid, "<the input it's waiting for>\n") unblocks it.

Core tool inventory

Grouped index of the tools an agent reaches for most often. Not exhaustive — the MCP exposes additional config / diagnostics / feedback tools beyond this list. Detailed parameters and return shapes for every tool are in the MCP's own tool descriptions.

Process / shell: start_process, interact_with_process, read_process_output, list_processes, list_sessions, kill_process, force_terminate
Files (read/write): read_file, read_multiple_files, write_file, edit_block, write_pdf
Filesystem: list_directory, get_file_info, move_file, create_directory
Search: start_search, get_more_search_results, list_searches, stop_search
Diagnostics / config: get_recent_tool_calls, get_config

Conventions

Prefer absolute paths. Relative paths may fail depending on the working directory, and tilde paths (~/...) may not expand in all contexts. Absolute paths are the most reliable; pass them whenever you can.

Allowed-directory scope. File operations only work inside the user's configured allowedDirectories. Expect [DENIED] markers in list_directory output and rejections from read_file / write_file when the path is out of scope. Surface the rejected path to the user — don't retry.
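
An agent can anticipate most scope rejections with a local containment check before calling a file tool. This is a sketch under assumptions: the MCP's own enforcement may treat symlinks or letter case differently, and the function name in_scope and the sample paths are hypothetical.

```python
from pathlib import Path


def in_scope(path: str, allowed_directories: list[str]) -> bool:
    """True if `path` is one of the allowed directories or lies inside one."""
    target = Path(path).resolve()  # normalizes ".." segments
    for root in allowed_directories:
        base = Path(root).resolve()
        if target == base or base in target.parents:
            return True
    return False


allowed = ["/Users/example/Downloads"]  # hypothetical allowedDirectories value
print(in_scope("/Users/example/Downloads/report.pdf", allowed))      # True
print(in_scope("/Users/example/Downloads/../.ssh/id_rsa", allowed))  # False
```

When the check fails, the useful next step is the one named above: tell the user which path was out of scope instead of retrying.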

When running on macOS: default shell is zsh. Use python3 not python. Some GNU tools have prefixed names (gsed for GNU sed). brew is the typical package manager. open opens files / apps from the terminal, mdfind is the fastest path to exact-filename search via Spotlight. Detect the host platform via get_config (or by inspecting process.platform / uname from a shell) before assuming any of the above — Windows and Linux hosts behave differently.

Pagination. Long outputs (file reads, process output, search results) all support offset and length. Negative offsets read from the end (tail mode). Use these instead of dumping huge results into context.
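
The offset and length semantics can be pictured with plain list slicing. This is a simplified model, not the tools' implementation: the default length of 1000 and the choice to ignore length in tail mode are assumptions, and the log lines are synthetic.

```python
def page(lines: list[str], offset: int = 0, length: int = 1000) -> list[str]:
    """Return up to `length` lines from `offset`; a negative offset counts from the end."""
    if offset < 0:
        return lines[offset:]  # tail mode: the last |offset| lines
    return lines[offset:offset + length]


log = [f"line {n}" for n in range(1, 201)]  # 200 synthetic lines
print(page(log, offset=0, length=3))  # ['line 1', 'line 2', 'line 3']
print(page(log, offset=-2))           # ['line 199', 'line 200']
```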

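The original text and its copyright belong to Anthropic and the respective plugin authors. The Japanese translation of this page was generated automatically with the Claude API.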