🧠knowledge-synthesis

Plugin: Enterprise Search

Description

Combines search results from multiple sources into coherent, deduplicated answers with source attribution. Handles confidence scoring based on freshness and authority, and summarizes large result sets effectively.

Use Cases

✓ Combining search results from multiple sources
✓ Eliminating duplicate information
✓ Answering with explicit source attribution
✓ Summarizing large result sets

Knowledge Synthesis

The last mile of enterprise search. Takes raw results from multiple sources and produces a coherent, trustworthy answer.

The Goal

Transform this:

~~chat result: "Sarah said in #eng: 'let's go with REST, GraphQL is overkill for our use case'"
~~email result: "Subject: API Decision — Sarah's email confirming REST approach with rationale"
~~cloud storage result: "API Design Doc v3 — updated section 2 to reflect REST decision"
~~project tracker result: "Task: Finalize API approach — marked complete by Sarah"

Into this:

The team decided to go with REST over GraphQL for the API redesign. Sarah made the
call, noting that GraphQL was overkill for the current use case. This was discussed
in #engineering on Tuesday, confirmed via email Wednesday, and the design doc has
been updated to reflect the decision. The related ~~project tracker task is marked complete.

Sources:
- ~~chat: #engineering thread (Jan 14)
- ~~email: "API Decision" from Sarah (Jan 15)
- ~~cloud storage: "API Design Doc v3" (updated Jan 15)
- ~~project tracker: "Finalize API approach" (completed Jan 15)

Deduplication

Cross-Source Deduplication

The same information often appears in multiple places. Identify and merge duplicates:

Signals that results are about the same thing:

Same or very similar text content
Same author/sender
Timestamps within a short window (same day or adjacent days)
References to the same entity (project name, document, decision)
One source references another ("as discussed in ~~chat", "per the email", "see the doc")
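The signals above can be counted mechanically. The following is a minimal sketch, not the skill's actual implementation: the `Result` fields, the 0.8 similarity threshold, the "source name appears in the other text" cross-reference check, and the `min_signals` cutoff are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Result:
    source: str                        # hypothetical label, e.g. "chat" or "email"
    author: str
    day: date
    text: str
    entities: frozenset = frozenset()  # project names, documents, decisions


def same_thing_signals(a: Result, b: Result) -> int:
    """Count how many of the five listed signals fire for a pair of results."""
    a_text, b_text = a.text.lower(), b.text.lower()
    signals = 0
    # Same or very similar text content (0.8 is an arbitrary threshold).
    if SequenceMatcher(None, a_text, b_text).ratio() >= 0.8:
        signals += 1
    # Same author/sender.
    if a.author == b.author:
        signals += 1
    # Timestamps on the same day or adjacent days.
    if abs((a.day - b.day).days) <= 1:
        signals += 1
    # References to the same entity.
    if a.entities & b.entities:
        signals += 1
    # One source references the other (crude check: source name in the text).
    if a.source in b_text or b.source in a_text:
        signals += 1
    return signals


def likely_duplicates(a: Result, b: Result, min_signals: int = 3) -> bool:
    return same_thing_signals(a, b) >= min_signals
```

Requiring several signals rather than any single one keeps two unrelated messages from the same author on the same day from being merged.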

How to merge:

Combine into a single narrative item
Cite all sources where it appeared
Use the most complete version as the primary text
Add unique details from each source
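The four merge steps above can be sketched as follows. This assumes the results in a group have already been judged to describe the same thing; the `Result` and `KnowledgeItem` types are hypothetical, and text length is used only as a crude stand-in for "most complete".

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    source: str
    text: str
    details: set = field(default_factory=set)  # facts unique to this source


@dataclass
class KnowledgeItem:
    text: str      # primary narrative text
    sources: list  # every source where the information appeared
    details: set   # unique details gathered from all sources


def merge_duplicates(group: list) -> KnowledgeItem:
    """Merge a non-empty group of duplicate results into one item."""
    # Use the most complete version as the primary text (longest = proxy).
    primary = max(group, key=lambda r: len(r.text))
    # Cite all sources, preserving first-seen order without repeats.
    sources = []
    for r in group:
        if r.source not in sources:
            sources.append(r.source)
    # Add the unique details contributed by each source.
    details = set().union(*(r.details for r in group))
    return KnowledgeItem(primary.text, sources, details)
```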

Deduplication Priority

When the same information exists in multiple sources, prefer:

1. The most complete version (fullest context)
2. The most authoritative source (official doc > chat)
3. The most recent version (latest update wins for evolving info)
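One way to apply this ordering is a tuple sort key, where later criteria only break ties left by earlier ones. This is a sketch under assumptions: the `completeness` and `authority` integer scores are hypothetical inputs that a caller would have to supply.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Version:
    source: str
    completeness: int  # hypothetical score: higher = fuller context
    authority: int     # hypothetical score: higher = more authoritative
    updated: date


def pick_primary(versions: list) -> Version:
    # Tuples compare element by element, so completeness decides first,
    # authority breaks completeness ties, and recency breaks authority ties.
    return max(versions, key=lambda v: (v.completeness, v.authority, v.updated))
```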

What NOT to Deduplicate

Keep as separate items when:

The same topic is discussed but with different conclusions
Different people express different viewpoints
The information evolved meaningfully between sources (v1 vs v2 of a decision)
Different time periods are represented

Citation and Source Attribution

Every claim in the synthesized answer must be attributable to a source.

Attribution Format

Inline for direct references:

Sarah confirmed the REST approach in her email on Wednesday.
The design doc was updated to reflect this (~~cloud storage: "API Design Doc v3").

Source list at the end for completeness:

Sources:
- ~~chat: #engineering discussion (Jan 14) — initial decision thread
- ~~email: "API Decision" from Sarah Chen (Jan 15) — formal confirmation
- ~~cloud storage: "API Design Doc v3" last modified Jan 15 — updated specification

Attribution Rules

Always name the source type (~~chat, ~~email, ~~cloud storage, etc.)
Include the specific location (channel, folder, thread)
Include the date or relative time
Include the author when relevant
Include document/thread titles when available
For ~~chat, note the channel name
For ~~email, note the subject line and sender
For ~~cloud storage, note the document title
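These rules lend themselves to a small formatter. The sketch below is illustrative only: the `Source` field names are assumptions, and the output format simply mirrors the source-list examples shown earlier in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    kind: str                       # "~~chat", "~~email", "~~cloud storage", ...
    location: Optional[str] = None  # channel, folder, or thread
    title: Optional[str] = None     # subject line or document/thread title
    author: Optional[str] = None    # included only when relevant
    when: Optional[str] = None      # date or relative time, e.g. "Jan 15"


def format_citation(s: Source) -> str:
    """Build one source-list line; the source type is always named."""
    parts = [f"{s.kind}:"]
    if s.location:
        parts.append(s.location)
    if s.title:
        parts.append(f'"{s.title}"')
    if s.author:
        parts.append(f"from {s.author}")
    if s.when:
        parts.append(f"({s.when})")
    return "- " + " ".join(parts)
```

For example, `format_citation(Source("~~chat", location="#engineering thread", when="Jan 14"))` yields `- ~~chat: #engineering thread (Jan 14)`.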

Confidence Levels

Not all results are equally trustworthy. Assess confidence based on:

Freshness

Recency	Confidence impact
Today / yesterday	High confidence for current state
This week	Good confidence
This month	Moderate — things may have changed
Older than a month	Lower confidence — flag as potentially outdated

For status queries, heavily weight freshness. For policy/factual queries, freshness matters less.
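The freshness table can be approximated with day-count buckets. In this sketch, "this week" is treated as 7 days and "this month" as 30 days, and the numeric query weights are arbitrary placeholders; none of these numbers come from the source.

```python
from datetime import date


def freshness_level(result_day: date, today: date) -> str:
    """Map a result's age onto the four rows of the freshness table."""
    age = (today - result_day).days
    if age <= 1:
        return "high"      # today / yesterday
    if age <= 7:
        return "good"      # this week (assumed: within 7 days)
    if age <= 30:
        return "moderate"  # this month (assumed: within 30 days)
    return "lower"         # older than a month: flag as potentially outdated


def freshness_weight(query_kind: str) -> float:
    # Status queries weight freshness heavily; policy/factual queries less.
    # The values 0.7 and 0.3 are illustrative assumptions.
    return 0.7 if query_kind == "status" else 0.3
```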

Authority

Source type	Authority level
Official wiki / knowledge base	Highest — curated, maintained
Shared documents (final versions)	High — intentionally published
Email announcements	High — formal communication
Meeting notes	Moderate-high — may be incomplete
Chat messages (thread conclusions)	Moderate — informal but real-time
Chat messages (mid-thread)	Lower — may not reflect final position
Draft documents	Low — not finalized
Task comments	Contextual — depends on commenter
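The authority table can be encoded as an ordinal lookup. This is a sketch: the key names and integer ranks are assumptions, only the ordering reflects the table, and ranking "Lower" above "Low" follows the table's row order rather than anything stated explicitly.

```python
from typing import Optional

# Ordinal ranks; larger means more authoritative.
AUTHORITY_RANK = {
    "official_wiki": 6,           # Highest: curated, maintained
    "shared_doc_final": 5,        # High: intentionally published
    "email_announcement": 5,      # High: formal communication
    "meeting_notes": 4,           # Moderate-high: may be incomplete
    "chat_thread_conclusion": 3,  # Moderate: informal but real-time
    "chat_mid_thread": 2,         # Lower: may not reflect final position
    "draft_document": 1,          # Low: not finalized
}


def authority_rank(source_type: str, commenter_rank: Optional[int] = None) -> int:
    if source_type == "task_comment":
        # Contextual: depends on the commenter. Falling back to the
        # "Moderate" rank when the commenter is unknown is an assumption.
        return commenter_rank if commenter_rank is not None else 3
    return AUTHORITY_RANK[source_type]
```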

Expressing Confidence

When confidence is high (multiple fresh, authoritative sources agree):

The team decided to use REST for the API redesign. [direct statement]

When confidence is moderate (single source or somewhat dated):

Based on the discussion in #engineering last month, the team was leaning
toward REST for the API redesign. This may have evolved since then.

When confidence is low (old data, informal source, or conflicting signals):

I found a reference to an API migration discussion from three months ago
in ~~chat, but I couldn't find a formal decision document. The information
may be outdated. You might want to check with the team for current status.
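The three tiers above can be expressed as a small decision function. The inputs and cutoffs here are assumptions made for illustration: 7 days for "fresh" and 60 days for "old" were chosen so that the "last month" example lands in moderate and the "three months ago" example lands in low.

```python
def confidence_level(n_sources: int, age_days: int,
                     authoritative: bool, conflicting: bool) -> str:
    """Pick a confidence tier from a few coarse inputs."""
    # Low: old data, informal source, or conflicting signals.
    if conflicting or not authoritative or age_days > 60:
        return "low"
    # High: multiple fresh, authoritative sources agree.
    if n_sources >= 2 and age_days <= 7:
        return "high"
    # Moderate: single source or somewhat dated.
    return "moderate"


# How each tier should shape the wording of the answer.
PHRASING = {
    "high": "state the finding directly",
    "moderate": "attribute the finding and note it may have changed",
    "low": "flag it as possibly outdated and suggest checking with the team",
}
```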

Conflicting Information

When sources disagree:

I found conflicting information about the API approach:
- The ~~chat discussion on Jan 10 suggested GraphQL
- But Sarah's email on Jan 15 confirmed REST
- The design doc (updated Jan 15) reflects REST

The most recent sources indicate REST was the final decision,
but the earlier ~~chat discussion explored GraphQL first.

Always surface conflicts rather than silently picking one version.
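A conflict check can be sketched as grouping claims by topic and reporting any topic with more than one distinct value. The `(topic, value, source, day)` tuple shape is a hypothetical representation, and the year in the usage example is invented since the source gives only month and day.

```python
from collections import defaultdict


def find_conflicts(claims):
    """claims: iterable of (topic, value, source, iso_day) tuples.

    Returns {topic: entries} for topics whose sources disagree, with
    entries sorted oldest first so the latest position comes last.
    """
    by_topic = defaultdict(list)
    for topic, value, source, day in claims:
        by_topic[topic].append((day, value, source))
    conflicts = {}
    for topic, entries in by_topic.items():
        distinct_values = {value for _, value, _ in entries}
        if len(distinct_values) > 1:
            # Keep every version so the answer can show all of them.
            conflicts[topic] = sorted(entries)
    return conflicts


claims = [
    ("API approach", "GraphQL", "~~chat", "2025-01-10"),
    ("API approach", "REST", "~~email", "2025-01-15"),
    ("API approach", "REST", "~~cloud storage", "2025-01-15"),
]
conflicts = find_conflicts(claims)
```

Returning every entry, rather than only the most recent one, leaves the choice of how to present the disagreement to the answer-writing step.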

Summarization Strategies

For Small Result Sets (1-5 results)

Present each result with context. No summarization needed — give the user everything:

[Direct answer synthesized from results]

[Detail from source 1]
[Detail from source 2]

Sources: [full attribution]

For Medium Result Sets (5-15 results)

Group by theme and summarize each group:

[Overall answer]

Theme 1: [summary of related results]
Theme 2: [summary of related results]

Key sources: [top 3-5 most relevant sources]
Full results: [count] items found across [sources]

For Large Result Sets (15+ results)

Provide a high-level synthesis with the option to drill down:

[Overall answer based on most relevant results]

Summary:
- [Key finding 1] (supported by N sources)
- [Key finding 2] (supported by N sources)
- [Key finding 3] (supported by N sources)

Top sources:
- [Most authoritative/relevant source]
- [Second most relevant]
- [Third most relevant]

Found [total count] results across [source list].
Want me to dig deeper into any specific aspect?
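Choosing among the three strategies is a simple threshold check on the result count. Note that the stated ranges share their boundary values (5 and 15); this sketch assumes 5 counts as small and 15 as medium, and the strategy names are illustrative labels.

```python
def summarization_strategy(n_results: int) -> str:
    """Select a presentation strategy from the number of results."""
    if n_results < 1:
        raise ValueError("need at least one result to summarize")
    if n_results <= 5:
        return "present_each"              # give the user everything
    if n_results <= 15:
        return "group_by_theme"            # summarize each theme
    return "high_level_with_drill_down"    # synthesis plus offer to dig deeper
```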

Summarization Rules

Lead with the answer, not the search process
Do not list raw results — synthesize them into narrative
Group related items from different sources together
Preserve important nuance and caveats
Include enough detail that the user can decide whether to dig deeper
Always offer to provide more detail if the result set was large

Synthesis Workflow

[Raw results from all sources]
          ↓
[1. Deduplicate — merge same info from different sources]
          ↓
[2. Cluster — group related results by theme/topic]
          ↓
[3. Rank — order clusters and items by relevance to query]
          ↓
[4. Assess confidence — freshness × authority × agreement]
          ↓
[5. Synthesize — produce narrative answer with attribution]
          ↓
[6. Format — choose appropriate detail level for result count]
          ↓
[Coherent answer with sources]
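The six steps can be strung together in a toy pipeline. This is a rough sketch under heavy assumptions, not the skill's implementation: each result is a dict with hypothetical keys (`key`, `theme`, `source`, `text`, `freshness`, `authority`), duplicates are assumed to be pre-labeled with a shared `key`, relevance is plain query-term overlap, and "agreement" is approximated as reaching two or more sources.

```python
from collections import defaultdict


def synthesize(results, query_terms):
    # 1. Deduplicate: results sharing a key merge into one item.
    items = {}
    for r in results:
        item = items.setdefault(
            r["key"],
            {"theme": r["theme"], "text": "", "sources": [], "quality": 0.0},
        )
        item["sources"].append(r["source"])
        if len(r["text"]) > len(item["text"]):  # keep the fullest wording
            item["text"] = r["text"]
        item["quality"] = max(item["quality"], r["freshness"] * r["authority"])

    # 2. Cluster: group merged items by theme.
    clusters = defaultdict(list)
    for item in items.values():
        clusters[item["theme"]].append(item)

    # 3. Rank: order items, then clusters, by query-term overlap.
    def relevance(item):
        text = item["text"].lower()
        return sum(term.lower() in text for term in query_terms)

    for members in clusters.values():
        members.sort(key=relevance, reverse=True)
    ranked = sorted(clusters.items(), key=lambda kv: relevance(kv[1][0]), reverse=True)

    # 4. Assess confidence: freshness x authority x agreement.
    for item in items.values():
        agreement = min(1.0, len(item["sources"]) / 2)  # 2+ sources = 1.0
        item["confidence"] = item["quality"] * agreement

    # 5 and 6. Synthesize with attribution; large sets show one item per theme.
    per_theme = 1 if len(results) > 15 else len(items)
    lines = []
    for theme, members in ranked:
        for item in members[:per_theme]:
            cites = ", ".join(item["sources"])
            lines.append(
                f"{theme}: {item['text']} [{cites}] (confidence {item['confidence']:.2f})"
            )
    return lines
```

Multiplying the three confidence factors means a weak score on any one of them pulls the overall confidence down, which matches the "×" notation in step 4.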

Anti-Patterns

Do not:

List results source by source ("From ~~chat: ... From ~~email: ... From ~~cloud storage: ...")
Include irrelevant results just because they matched a keyword
Bury the answer under methodology explanation
Present conflicting info without flagging the conflict
Omit source attribution
Present uncertain information with the same confidence as well-supported facts
Summarize so aggressively that useful detail is lost

Do:

Lead with the answer
Group by topic, not by source
Flag confidence levels when appropriate
Surface conflicts explicitly
Attribute all claims to sources
Offer to go deeper when result sets are large

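The original text and its copyright belong to Anthropic and the respective plugin authors. The Japanese translation of this page was generated automatically with the Claude API.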