🌐iot
- Plugin
- aws-dev-toolkit
Description
Deep-dive into AWS IoT architecture, device connectivity, edge computing, and fleet management. This skill should be used when the user asks to "design an IoT solution", "connect devices to AWS", "set up MQTT messaging", "configure IoT rules", "provision a device fleet", "use Greengrass at the edge", "build a device shadow", "set up IoT security", "manage OTA updates", "store telemetry data", "create IoT topic rules", "configure fleet provisioning", or mentions IoT Core, MQTT, Greengrass, Device Shadow, IoT Rules Engine, IoT Events, IoT SiteWise, fleet indexing, or device certificates.
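Use cases
- ✓ When designing an IoT solution
- ✓ When building device connectivity to AWS
- ✓ When setting up an MQTT messaging environment
- ✓ When configuring IoT rules and device fleets
- ✓ When building an edge computing environment
Body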
Specialist guidance for AWS IoT. Covers IoT Core (MQTT, shadows, rules engine), Greengrass v2 edge compute, fleet provisioning, security, data storage patterns, and fleet management.
Process
- Identify the IoT workload characteristics: device count, message frequency, payload size, connectivity (always-on vs intermittent), edge processing needs
- Use the awsknowledge MCP tools (mcp__plugin_aws-dev-toolkit_awsknowledge__aws___search_documentation, mcp__plugin_aws-dev-toolkit_awsknowledge__aws___read_documentation, mcp__plugin_aws-dev-toolkit_awsknowledge__aws___recommend) to verify current IoT Core limits, Greengrass component versions, and service quotas
- Select the appropriate IoT services using the decision matrix below
- Design the communication and data ingestion topology (protocols, topics, rules)
- Configure security (X.509 certificates, IoT policies, fleet provisioning method)
- Design data storage and analytics pipeline
- Plan fleet management (jobs, indexing, Device Defender)
- Recommend operational best practices (monitoring, OTA updates, edge deployments)
IoT Service Selection Decision Matrix
| Requirement | Recommendation | Why |
|---|---|---|
| Devices sending telemetry to cloud | IoT Core (MQTT) | Persistent connections, sub-second latency, bidirectional, scales to millions of concurrent connections |
| Request/response from constrained devices | IoT Core (HTTPS) | Stateless, no persistent connection needed, but higher latency and no server-to-device push |
| Browser or mobile app to IoT backend | IoT Core (MQTT over WebSocket) | Works through firewalls/proxies, uses IAM or Cognito auth instead of X.509 certificates |
| Edge preprocessing before cloud upload | Greengrass v2 | Reduces bandwidth cost and cloud ingestion volume by filtering/aggregating at the edge |
| Local device control when internet is down | Greengrass v2 | Local MQTT broker keeps device-to-device communication working during cloud disconnection |
| Industrial OPC-UA data collection | IoT SiteWise | Purpose-built for industrial protocols, asset modeling, and time-series with SiteWise Edge gateway |
| State machine on device events | IoT Events | Detector models react to patterns across multiple devices without custom Lambda logic |
| Time-series telemetry storage | Timestream | Purpose-built for time-series with automatic tiering (memory to magnetic), built-in interpolation and aggregation functions |
| Device metadata and state lookups | DynamoDB | Single-digit ms latency for key-value access to device config, state, and registry data |
| Bulk telemetry archival | S3 | Cheapest storage for raw telemetry; query with Athena when needed |
| Telemetry search and dashboards | OpenSearch | Full-text search and Kibana/OpenSearch Dashboards for operational visibility |
Protocol Selection
MQTT (Default Choice)
Use MQTT for device-to-cloud communication unless there is a specific reason not to. MQTT uses persistent TCP connections with minimal overhead (2-byte header minimum), supports QoS 0 (at most once) and QoS 1 (at least once), and enables server-initiated push to devices via subscriptions.
- QoS 0: Use for high-frequency telemetry where occasional message loss is acceptable (sensor readings every second). Lower overhead because no acknowledgment round-trip.
- QoS 1: Use for commands, configuration changes, and alerts where delivery must be confirmed. The broker retries until PUBACK is received.
- QoS 2 is not supported by AWS IoT Core. If exactly-once semantics are required, implement idempotency in the application layer.
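Because QoS 1 can deliver the same message more than once, the receiver needs a way to recognize duplicates. The following is a minimal sketch of application-layer idempotency, assuming each message carries an application-level message ID (a hypothetical field, not the MQTT packet identifier). The class name and the bounded cache size are illustrative.

```python
from collections import OrderedDict


class IdempotentHandler:
    """Drops redelivered messages by remembering recently seen message IDs."""

    def __init__(self, max_ids=10_000):
        self._seen = OrderedDict()  # message_id -> None, oldest first
        self._max_ids = max_ids
        self.applied = []  # payloads that were actually processed

    def handle(self, message_id, payload):
        """Return True if the payload was applied, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = None
        if len(self._seen) > self._max_ids:
            self._seen.popitem(last=False)  # evict the oldest ID
        self.applied.append(payload)
        return True


handler = IdempotentHandler()
first = handler.handle("cmd-42", {"action": "reboot"})
second = handler.handle("cmd-42", {"action": "reboot"})  # simulated QoS 1 redelivery
```

A bounded cache only protects against duplicates that arrive while the ID is still remembered; a durable store would be needed if duplicates can arrive across restarts.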
MQTT v5 Features (Prefer When Devices Support It)
- Shared subscriptions: Distribute messages across multiple subscribers for load balancing backend processors, avoiding hot-partition on a single consumer
- Topic aliases: Replace long topic strings with short integer aliases after first publish, reducing per-message overhead for bandwidth-constrained devices
- Message expiry: Set TTL on messages so stale commands are discarded rather than delivered to a device that reconnects hours later
- Session expiry: Control how long the broker holds session state after disconnect, preventing unbounded memory growth from abandoned devices
HTTPS
Use HTTPS only for devices that wake up, send a single reading, and sleep (battery-powered sensors with cellular connectivity). HTTPS does not support subscriptions, so the device cannot receive commands without polling. Every request incurs TLS handshake overhead.
MQTT over WebSocket
Use for browser-based dashboards and mobile apps that need real-time device data. Authenticates with IAM credentials or Cognito identity pools instead of X.509 certificates. Works through corporate proxies and firewalls that block raw TCP on port 8883.
Topic Design
Design topics as a hierarchy with device identity and data type segments. This enables fine-grained IoT policy access control and targeted rules engine subscriptions.
Recommended Structure
{org}/{environment}/{device-type}/{device-id}/{data-category}
Examples:
acme/prod/temperature-sensor/sensor-001/telemetry
acme/prod/temperature-sensor/sensor-001/alerts
acme/prod/temperature-sensor/sensor-001/commands
acme/prod/temperature-sensor/+/telemetry # Rule subscribes to all sensors
Topic Design Rules
- Include the device ID in the topic so IoT policies can use ${iot:Connection.Thing.ThingName} to restrict each device to its own topics
- Separate telemetry, commands, and alerts into distinct subtopics so rules can target specific data types without parsing payloads
- Use + (single-level) and # (multi-level) wildcards in rules and subscriptions, never in publish topics
- Keep topics under 7 levels deep to stay within IoT Core limits and maintain readability
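The topic structure and wildcard rules can be checked locally before any device code is written. This sketch builds a topic from the recommended five segments and tests it against an MQTT filter. Both function names are hypothetical helpers, and the matcher ignores the special handling of topics that start with $.

```python
def build_topic(org, env, device_type, device_id, category):
    """Build {org}/{environment}/{device-type}/{device-id}/{data-category}."""
    segments = [org, env, device_type, device_id, category]
    for segment in segments:
        # Publish topics must not contain separators or wildcards inside a segment.
        if not segment or any(ch in segment for ch in "/+#"):
            raise ValueError(f"invalid topic segment: {segment!r}")
    return "/".join(segments)


def topic_matches(topic_filter, topic):
    """Return True if a concrete topic matches a filter using + and # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return i == len(filter_levels) - 1  # '#' is only valid as the last level
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


topic = build_topic("acme", "prod", "temperature-sensor", "sensor-001", "telemetry")
```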
Basic Ingest
For high-volume telemetry that goes directly to rules engine actions without needing the message broker, use the $aws/rules/<rule-name> topic prefix. Basic Ingest skips the message broker publish cost ($1.00 per million messages), saving significant cost at scale. The tradeoff: messages sent via Basic Ingest cannot be received by other MQTT subscribers.
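To get a feel for the saving, the sketch below builds a Basic Ingest topic and estimates the broker publish charge it would avoid. It uses the $1.00 per million figure quoted above and assumes every message fits in one metered unit. The function names are illustrative, and current pricing should be confirmed before relying on the numbers.

```python
PUBLISH_PRICE_PER_MILLION = 1.00  # USD, the figure quoted in the text above


def basic_ingest_topic(rule_name, topic=""):
    """Return the Basic Ingest topic that targets a rule directly."""
    base = f"$aws/rules/{rule_name}"
    return f"{base}/{topic}" if topic else base


def monthly_publish_cost(devices, messages_per_second, days=30):
    """Estimate the broker publish cost that Basic Ingest would avoid."""
    messages = devices * messages_per_second * 86_400 * days
    return messages / 1_000_000 * PUBLISH_PRICE_PER_MILLION


ingest_topic = basic_ingest_topic("telemetry_to_s3", "acme/prod/temperature-sensor/sensor-001/telemetry")
avoided = monthly_publish_cost(devices=10_000, messages_per_second=1)
```

Under these assumptions, 10,000 devices publishing once per second produce about 25.9 billion messages in a 30-day month.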
Device Shadow
Device Shadow maintains a JSON document of desired and reported state for each device. Use shadows when cloud applications need to read or set device state regardless of whether the device is currently connected.
Classic vs Named Shadows
- Classic shadow: One per thing. Use for the primary device state (power on/off, firmware version, connectivity status).
- Named shadows: Up to 10 per thing. Use to separate independent state concerns (e.g., one shadow for configuration, another for diagnostics, another for firmware). Named shadows avoid state conflicts when multiple applications update different aspects of the same device.
Shadow Best Practices
- Keep shadow documents small (<8 KB). Large shadows increase MQTT message size and DynamoDB read/write costs on the shadow service backend.
- Use reported state from the device, desired state from the cloud application. The delta field tells the device what to change.
- Set version-based optimistic locking on updates to prevent stale writes from overwriting newer state.
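The desired/reported/delta relationship and version-based locking can be illustrated with an in-memory stand-in. This is a simplified sketch, not the Device Shadow service API: it handles only flat (non-nested) state, and ShadowStore, VersionConflict, and compute_delta are hypothetical names.

```python
def compute_delta(desired, reported):
    """Return the desired keys whose values differ from the reported state."""
    return {k: v for k, v in desired.items() if reported.get(k) != v}


class VersionConflict(Exception):
    """Raised when an update is based on a stale shadow version."""


class ShadowStore:
    """In-memory stand-in for a shadow with version-checked updates."""

    def __init__(self):
        self.version = 0
        self.state = {"desired": {}, "reported": {}}

    def update(self, section, changes, expected_version):
        # Reject writes that were computed against an older version.
        if expected_version != self.version:
            raise VersionConflict(f"expected {expected_version}, current {self.version}")
        self.state[section].update(changes)
        self.version += 1
        return self.version


shadow = ShadowStore()
v1 = shadow.update("desired", {"power": "on"}, expected_version=0)
v2 = shadow.update("reported", {"power": "off"}, expected_version=v1)
delta = compute_delta(shadow.state["desired"], shadow.state["reported"])
```

A writer that still holds version 1 would now get a VersionConflict and has to re-read the shadow before retrying.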
IoT Rules Engine
The rules engine evaluates SQL statements against incoming MQTT messages and routes matching data to AWS service actions. Every production deployment should have at least one rule for data ingestion and error handling.
Rule SQL Basics
SELECT temperature, humidity, timestamp() as ts, topic(4) as device_id
FROM 'acme/prod/temperature-sensor/+/telemetry'
WHERE temperature > 0 AND temperature < 150
- topic(n) extracts the nth level from the topic string (1-indexed)
- timestamp() adds server-side UTC timestamp
- WHERE clause filters before action execution, reducing downstream processing cost
- Use SELECT * sparingly; extract only the fields needed to minimize action payload size
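To sanity-check what the example rule selects and filters, its behavior can be approximated in plain Python. This is a local emulation under simplifying assumptions, not the rules engine itself: topic_level and apply_rule are hypothetical names, and the now_ms parameter exists only to make the result deterministic in tests.

```python
import time


def topic_level(topic, n):
    """Mimic the rule SQL topic(n) function: return the 1-indexed topic segment."""
    return topic.split("/")[n - 1]


def apply_rule(topic, payload, now_ms=None):
    """Return the selected fields, or None when the WHERE clause rejects the message."""
    temperature = payload.get("temperature")
    if temperature is None or not (0 < temperature < 150):
        return None
    return {
        "temperature": temperature,
        "humidity": payload.get("humidity"),
        "ts": now_ms if now_ms is not None else int(time.time() * 1000),
        "device_id": topic_level(topic, 4),
    }


selected = apply_rule(
    "acme/prod/temperature-sensor/sensor-001/telemetry",
    {"temperature": 21.5, "humidity": 40, "battery": 88},
    now_ms=0,
)
```

The battery field is dropped because the SELECT list does not name it, which is the payload-trimming effect described above.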
Action Selection Guide
| Data Destination | Rule Action | When to Use |
|---|---|---|
| Real-time processing | Lambda | Custom transformation, enrichment, or fan-out logic |
| Time-series storage | Timestream | Telemetry that needs time-range queries and aggregation |
| Key-value lookups | DynamoDB / DynamoDBv2 | Device metadata, latest state, configuration |
| Streaming analytics | Kinesis Data Streams | High-throughput ingestion for real-time analytics pipelines |
| Bulk archival | S3 | Raw telemetry archival for compliance or batch analytics |
| Notifications | SNS | Alert routing to email, SMS, or HTTP endpoints |
| Decoupled processing | SQS | Buffer messages for downstream consumers that process at their own rate |
| State machine triggers | IoT Events | Multi-device event correlation and complex event processing |
| Republish | IoT Core republish | Route to another MQTT topic for device-to-device via cloud |
| Search and dashboards | OpenSearch | Operational dashboards and full-text search over telemetry |
Error Actions (Always Configure)
Every rule must have an error action. Without one, failed rule actions silently drop data with no notification and no retry. Configure error actions to route failures to S3 or SQS for later reprocessing.
See references/rules-engine-patterns.md for detailed SQL examples and error action configuration.
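As an illustration of a rule that always carries an error action, the sketch below assembles a rule definition as a plain dictionary. The shape is assumed to follow the topicRulePayload argument of boto3's create_topic_rule call. The function name, bucket, role ARN, and queue URL are placeholders, and nothing here calls AWS.

```python
def build_rule_payload(sql, bucket, role_arn, error_queue_url):
    """Build a topicRulePayload-shaped dict with an S3 action and an SQS error action."""
    return {
        "sql": sql,
        "awsIotSqlVersion": "2016-03-23",
        "ruleDisabled": False,
        "actions": [
            {"s3": {"roleArn": role_arn, "bucketName": bucket,
                    "key": "${topic()}/${timestamp()}.json"}}
        ],
        # Failed actions are routed here instead of being dropped silently.
        "errorAction": {
            "sqs": {"roleArn": role_arn, "queueUrl": error_queue_url, "useBase64": False}
        },
    }


payload = build_rule_payload(
    sql="SELECT * FROM 'acme/prod/temperature-sensor/+/telemetry'",
    bucket="example-telemetry-bucket",
    role_arn="arn:aws:iam::ACCOUNT:role/example-iot-rule-role",
    error_queue_url="https://sqs.REGION.amazonaws.com/ACCOUNT/example-rule-errors",
)
```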
IoT SiteWise (Industrial IoT)
Use IoT SiteWise instead of raw IoT Core + custom storage when the workload involves industrial equipment with OPC-UA data sources, asset hierarchies, and time-series metrics that need automatic aggregation (min, max, avg, count over time windows).
When to Use IoT SiteWise
- Industrial environments with OPC-UA or Modbus data sources
- Need for asset hierarchy modeling (factory > line > machine > sensor)
- Pre-built portal/dashboard capabilities for operators (SiteWise Monitor)
- Edge data collection and processing via SiteWise Edge gateway
When to Skip IoT SiteWise
- Consumer IoT devices using MQTT natively (use IoT Core directly)
- Custom data formats that do not fit the asset model structure
- Workloads already using Timestream with custom dashboards (Grafana)
IoT Events
Use IoT Events when device telemetry needs to trigger state-machine logic across multiple devices or time windows, and the logic is too complex for simple IoT Rules Engine WHERE clauses.
Detector Models
- Define states (e.g., NORMAL, WARNING, CRITICAL) with transitions based on input conditions
- Each detector instance tracks state for one device independently
- Actions on state entry/exit/transition: send SNS, publish to IoT Core, invoke Lambda, write to DynamoDB
- Use for: equipment health monitoring, multi-sensor correlation, threshold-with-hysteresis alerting (avoid alert flapping by requiring sustained condition before state change)
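Threshold-with-hysteresis alerting can be sketched as a small state machine. This is a plain-Python illustration of the idea, not an IoT Events detector model definition: the class name, the two-state model, and the required-streak parameter are assumptions made for the example.

```python
class ThresholdDetector:
    """Changes state only after `required` consecutive readings disagree with it."""

    def __init__(self, threshold, required=3):
        self.threshold = threshold
        self.required = required
        self.state = "NORMAL"
        self._streak = 0  # consecutive readings that disagree with the current state

    def observe(self, value):
        breached = value > self.threshold
        disagrees = breached if self.state == "NORMAL" else not breached
        self._streak = self._streak + 1 if disagrees else 0
        if self._streak >= self.required:
            self.state = "WARNING" if self.state == "NORMAL" else "NORMAL"
            self._streak = 0
        return self.state


detector = ThresholdDetector(threshold=80, required=3)
states = [detector.observe(v) for v in [85, 70, 85, 85, 85, 70, 70, 70]]
```

The isolated spike at the start does not raise a warning; only three consecutive breaches do, and three consecutive normal readings are needed to clear it.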
Fleet Provisioning
Method Selection
| Scenario | Method | Why |
|---|---|---|
| Factory installs unique certs per device | JITP (Just-in-Time Provisioning) | Simplest: device connects, CA is recognized, thing is auto-created. Requires trusted manufacturing chain. |
| Factory installs unique certs, need custom validation | JITR (Just-in-Time Registration) | Lambda hook validates additional attributes before activating the certificate |
| Cannot install unique certs during manufacturing | Fleet Provisioning by Claim | Devices share a claim certificate, exchange it for a unique identity on first boot. Use pre-provisioning Lambda hook to validate serial numbers against an allow-list. |
| End user or installer provisions device | Fleet Provisioning by Trusted User | Mobile app generates temporary credentials for the device. Highest security for consumer devices. |
Provisioning Best Practices
- Always use a pre-provisioning Lambda hook with fleet provisioning by claim to validate the device identity against an allow-list. Without this, anyone with the claim certificate can provision unlimited devices.
- Scope provisioning templates to create minimal IoT policies. The provisioned policy should grant access only to that device's topics, using ${iot:Connection.Thing.ThingName} policy variables.
- Store device private keys in hardware security modules (HSM) or secure elements when available. Software-stored keys are extractable.
See references/security-provisioning.md for provisioning templates, certificate management, and IoT policy examples.
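A pre-provisioning hook for fleet provisioning by claim might look like the sketch below. It assumes the hook receives the template parameters under a parameters key and answers with allowProvisioning and optional parameterOverrides. The SerialNumber parameter name, the ThingName override, and the in-code allow-list are hypothetical; a real allow-list would more likely live in a database.

```python
ALLOWED_SERIALS = {"SN-0001", "SN-0002"}  # hypothetical allow-list


def lambda_handler(event, context):
    """Allow provisioning only for serial numbers on the allow-list."""
    serial = event.get("parameters", {}).get("SerialNumber")
    if serial in ALLOWED_SERIALS:
        return {
            "allowProvisioning": True,
            "parameterOverrides": {"ThingName": f"sensor-{serial}"},
        }
    # Unknown or missing serial numbers are refused.
    return {"allowProvisioning": False}


allowed = lambda_handler({"parameters": {"SerialNumber": "SN-0001"}}, None)
denied = lambda_handler({"parameters": {"SerialNumber": "SN-9999"}}, None)
```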
Security
X.509 Certificates
- Every device must authenticate with a unique X.509 client certificate. Shared certificates across devices make revocation impossible without affecting the entire fleet.
- Use AWS Private CA for production fleets. It provides automated certificate issuance, revocation (CRL), and integration with JITP.
- Rotate certificates before expiry using IoT Jobs to push new certificates and a Lambda to register them. Expired certificates cause immediate connection failure with no grace period.
IoT Policies
IoT policies control what MQTT topics a device can publish/subscribe to and what shadows/jobs it can access. Always use policy variables to scope per-device.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iot:Connect",
      "Resource": "arn:aws:iot:REGION:ACCOUNT:client/${iot:Connection.Thing.ThingName}"
    },
    {
      "Effect": "Allow",
      "Action": ["iot:Publish", "iot:Receive"],
      "Resource": "arn:aws:iot:REGION:ACCOUNT:topic/acme/prod/*/${iot:Connection.Thing.ThingName}/*"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Subscribe",
      "Resource": "arn:aws:iot:REGION:ACCOUNT:topicfilter/acme/prod/*/${iot:Connection.Thing.ThingName}/*"
    }
  ]
}
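The effect of the policy variable can be approximated locally. This sketch substitutes a thing name into the publish resource pattern from the policy above and checks a topic against it with shell-style wildcard matching. It is an approximation for reasoning about scope, not the IoT policy evaluation engine, and can_publish is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Resource suffix from the publish statement above (ARN prefix omitted).
PUBLISH_RESOURCE = "topic/acme/prod/*/${iot:Connection.Thing.ThingName}/*"


def can_publish(thing_name, topic):
    """Approximate whether the publish statement would allow this topic."""
    pattern = PUBLISH_RESOURCE.replace("${iot:Connection.Thing.ThingName}", thing_name)
    return fnmatchcase(f"topic/{topic}", pattern)


own_topic = can_publish("sensor-001", "acme/prod/temperature-sensor/sensor-001/telemetry")
other_topic = can_publish("sensor-001", "acme/prod/temperature-sensor/sensor-002/telemetry")
```

In this approximation the * wildcard also matches across slashes, so a pattern with * in the middle is looser than a single topic level; that is worth keeping in mind when placing wildcards before the device ID segment.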
Custom Authorizers
Use custom authorizers when devices cannot use X.509 certificates (e.g., legacy devices with token-based auth or OAuth). The authorizer is a Lambda function that validates the token and returns an IoT policy document. Custom authorizers add latency (Lambda cold start) and cost (per-invocation), so prefer X.509 certificates for new device designs.
Device Defender
- Audit: Scheduled checks for insecure configurations (overly permissive policies, shared certificates, disabled logging). Run at least weekly.
- Detect: Real-time anomaly detection on device metrics (message volume, connection patterns, authorization failures). Alerts when a device deviates from its baseline behavior, indicating compromise or misconfiguration.
- Configure mitigation actions to automatically quarantine compromised devices (move to a restricted thing group with minimal permissions).
Data Storage Patterns
Timestream (Time-Series Telemetry)
- Default choice for telemetry that needs time-range queries (temperature over last 24 hours, average power per hour).
- Automatic tiering: memory store (recent, fast queries) to magnetic store (historical, cheaper).
- Set memory store retention to match your hot-query window (1-24 hours typical). Data beyond this moves to magnetic automatically.
- Cost consideration: Timestream charges per write and per query scan. For very high-frequency telemetry (>1 msg/sec/device across thousands of devices), aggregate at the edge with Greengrass or use Basic Ingest to S3 with Athena for batch queries.
DynamoDB (Device Metadata and State)
- Use for device registry extensions, latest-known state, configuration, and command history.
- Design the partition key as the device ID for even distribution.
- Use TTL to auto-expire old command records and reduce storage cost.
- Do not store raw time-series telemetry in DynamoDB. At 1 msg/sec from 10,000 devices, that is 864 million writes/day, which costs roughly $1,100/day in on-demand write request charges (assuming about $1.25 per million write request units and one unit per message).
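The write-cost figure can be reproduced with a few lines of arithmetic. The price of $1.25 per million on-demand writes is an assumption chosen to match the estimate in the text; it also assumes each message consumes one write unit. Check current regional pricing before using the result.

```python
def daily_on_demand_write_cost(devices, messages_per_second, price_per_million_writes=1.25):
    """Return (writes per day, estimated USD per day) for on-demand writes."""
    writes_per_day = devices * messages_per_second * 86_400  # seconds per day
    cost = writes_per_day / 1_000_000 * price_per_million_writes
    return writes_per_day, cost


writes, cost = daily_on_demand_write_cost(devices=10_000, messages_per_second=1)
```

With these inputs the result is 864,000,000 writes and $1,080 per day, which the text rounds to roughly $1,100.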
S3 (Bulk Archival)
- Use IoT Rules Engine S3 action with partitioned keys: s3://bucket/year=2026/month=04/day=06/hour=12/device-id.json
- Query archived data with Athena using partition projection for cost-effective ad-hoc analysis.
- Enable S3 Intelligent-Tiering for automatic cost optimization on infrequently accessed telemetry.
- Cheapest option for long-term retention and compliance requirements.
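The partitioned key layout shown above can be generated from a timestamp. This sketch uses a hypothetical helper name and expects a timezone-aware datetime, which it converts to UTC so that partitions line up regardless of where the code runs.

```python
from datetime import datetime, timezone


def partitioned_key(device_id, when):
    """Build a Hive-style partitioned S3 key from a timezone-aware timestamp."""
    utc = when.astimezone(timezone.utc)
    return (f"year={utc:%Y}/month={utc:%m}/day={utc:%d}/hour={utc:%H}/"
            f"{device_id}.json")


key = partitioned_key("sensor-001", datetime(2026, 4, 6, 12, 30, tzinfo=timezone.utc))
```

Because S3 overwrites objects that share a key, a real pipeline would probably append a timestamp or sequence number to the file name so that multiple writes in the same hour are all kept.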
OpenSearch (Search and Analytics)
- Use when operators need full-text search across telemetry fields or real-time dashboards.
- IoT Rules Engine can write directly to OpenSearch Service.
- Cost consideration: OpenSearch clusters run 24/7 with dedicated instances. For intermittent analysis, prefer Athena on S3.
Greengrass v2 (Edge Compute)
When to Use Edge Compute
- Latency: Local control loops that must respond in <100ms (actuator control, safety shutoffs). Cloud round-trip adds 50-200ms minimum.
- Bandwidth: Devices generate more data than the network can upload. Aggregate or filter at the edge, send summaries to cloud.
- Intermittent connectivity: Sites with unreliable internet (remote oil wells, ships, mines). Greengrass buffers data and syncs when connected.
- Local ML inference: Run ML models on edge hardware (image classification, anomaly detection) without sending raw data to cloud.
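The bandwidth case can be made concrete with a small aggregation sketch. It collapses a window of readings into one summary and computes how many messages no longer need to leave the edge. The function names and the summary fields are illustrative and are not part of any Greengrass API.

```python
from statistics import mean


def summarize(readings):
    """Collapse a window of numeric readings into one summary message."""
    if not readings:
        raise ValueError("empty window")
    return {"count": len(readings), "min": min(readings),
            "max": max(readings), "avg": mean(readings)}


def reduction_percent(raw_messages, summary_messages):
    """Percentage of messages that no longer leave the edge."""
    return (1 - summary_messages / raw_messages) * 100


summary = summarize([20, 22, 24])
saved = reduction_percent(raw_messages=1000, summary_messages=1)
```

Averaging 1,000 readings into a single summary removes about 99.9% of the messages, at the cost of losing the individual samples unless they are also archived locally.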
When to Skip Edge Compute
- Devices with reliable, high-bandwidth connectivity and no latency requirements. Direct MQTT to IoT Core is simpler and eliminates edge infrastructure management.
- Very constrained devices (microcontrollers with <1MB RAM) that cannot run the Greengrass nucleus. Use FreeRTOS with direct IoT Core connectivity instead.
Component Model
Greengrass v2 uses a component model where each capability is a deployable unit (recipe + artifacts). Components can be:
- AWS-provided: Pre-built components for common tasks (stream manager, log manager, MQTT bridge, Docker application manager)
- Custom: Your application logic, packaged as a recipe (YAML/JSON) referencing artifacts (code, binaries, configs)
- Community: Third-party components from the Greengrass component catalog
Stream Manager
Use Stream Manager for reliable edge-to-cloud data transfer. It handles buffering, batching, bandwidth management, and automatic retry. Supports export to Kinesis Data Streams, S3, IoT Analytics, and IoT SiteWise.
- Configure per-stream: storage type (memory or file-system), max size, strategy when full (reject new or overwrite oldest)
- Set bandwidth limits to prevent telemetry uploads from starving control-plane traffic
- Minimum 70 MB RAM overhead for the stream manager component
See references/greengrass-patterns.md for component recipes, deployment configurations, and stream manager setup.
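The two strategies for a full stream (reject new or overwrite oldest) behave differently under sustained disconnection. The following sketch models them with a bounded in-memory buffer. It is not the Stream Manager SDK: EdgeBuffer is a hypothetical class, and it counts items where Stream Manager limits size in bytes.

```python
from collections import deque


class EdgeBuffer:
    """Bounded buffer with the two full-buffer strategies described above."""

    def __init__(self, max_items, strategy="overwrite_oldest"):
        if strategy not in ("overwrite_oldest", "reject_new"):
            raise ValueError(f"unknown strategy: {strategy}")
        self._items = deque()
        self._max_items = max_items
        self._strategy = strategy

    def append(self, item):
        """Return True if the item was stored, False if it was rejected."""
        if len(self._items) >= self._max_items:
            if self._strategy == "reject_new":
                return False
            self._items.popleft()  # drop the oldest item to make room
        self._items.append(item)
        return True

    def drain(self):
        """Remove and return everything buffered, oldest first."""
        items = list(self._items)
        self._items.clear()
        return items


overwrite = EdgeBuffer(max_items=2, strategy="overwrite_oldest")
overwrite_results = [overwrite.append(n) for n in (1, 2, 3)]
reject = EdgeBuffer(max_items=2, strategy="reject_new")
reject_results = [reject.append(n) for n in (1, 2, 3)]
```

Overwriting keeps the freshest data, which suits telemetry; rejecting keeps the earliest data, which suits records that must not be lost once accepted.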
Fleet Management
IoT Jobs (OTA Updates)
- Use Jobs for firmware updates, configuration changes, and certificate rotation across the fleet.
- Continuous jobs: Automatically target new devices added to a thing group. Use for ongoing compliance (all devices in group X must have firmware v2.3+).
- Snapshot jobs: One-time execution against a fixed set of targets.
- Configure rollout rate (max devices per minute) and abort criteria (% failures before halting) to prevent fleet-wide bricking from a bad update.
- Use signed job documents with code signing to prevent tampering.
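Rollout rate and abort criteria can be expressed as configuration and reasoned about locally. The dictionaries below are assumed to mirror the jobExecutionsRolloutConfig and abortConfig arguments of boto3's create_job call. The specific numbers are placeholders, and should_abort is a hypothetical helper that only approximates how the service applies the threshold.

```python
ROLLOUT_CONFIG = {"maximumPerMinute": 50}  # placeholder rollout rate
ABORT_CONFIG = {
    "criteriaList": [{
        "failureType": "FAILED",
        "action": "CANCEL",
        "thresholdPercentage": 5.0,
        "minNumberOfExecutedThings": 100,
    }]
}


def should_abort(failed, executed, abort_config=ABORT_CONFIG):
    """Return True when any abort criterion is met for the counts so far."""
    for rule in abort_config["criteriaList"]:
        # Too few executions to judge the failure rate yet.
        if executed == 0 or executed < rule["minNumberOfExecutedThings"]:
            continue
        if failed / executed * 100 > rule["thresholdPercentage"]:
            return True
    return False


healthy = should_abort(failed=4, executed=100)
failing = should_abort(failed=6, executed=100)
too_early = should_abort(failed=10, executed=50)
```

The minimum-executions setting keeps a handful of early failures from cancelling the job before the failure rate is meaningful.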
Fleet Indexing
- Enables search queries (using a Lucene-like query syntax) across device registry, shadow, connectivity, and Device Defender violation data.
- Must be explicitly enabled (off by default). Without fleet indexing, you cannot query fleet state at scale.
- Example: thingName:sensor-* AND shadow.reported.firmware:v2.1 AND connectivity.connected:false finds all disconnected sensors on old firmware.
- Use fleet metrics to push aggregated fleet statistics to CloudWatch for dashboards and alarms.
Key Limits (IoT Core)
| Resource | Default Limit | Notes |
|---|---|---|
| Maximum concurrent connections | 500,000 per account | Requestable increase |
| Maximum MQTT message size | 128 KB | Hard limit |
| Maximum publishes per second (per account) | 20,000 | Requestable increase |
| Maximum inbound publishes per second (per connection) | 100 | Per-device throttle |
| Persistent session expiry | 1 hour (default), up to 7 days | Configure per client |
| Maximum rules per account | 1,000 | Requestable increase |
| Maximum actions per rule | 10 | Hard limit |
| Maximum shadow document size | 8 KB (classic), 8 KB (named) | Hard limit |
| Named shadows per thing | 10 | Hard limit |
| Fleet provisioning templates per account | 256 | Requestable increase |
| Thing groups depth | 7 levels | Hard limit |
Anti-Patterns
- Polling instead of MQTT. Devices that HTTP poll for commands waste battery, bandwidth, and IoT Core request costs. A device polling every 5 seconds generates 17,280 requests/day; MQTT keeps a persistent connection with near-zero overhead when idle, and the server pushes commands instantly.
- No error actions on rules. Without an error action, a failed rule action (IAM permission issue, DynamoDB throttle, Lambda error) silently drops the message. There is no retry, no alert, and no way to recover the data. Always route errors to S3 or SQS.
- Overly permissive IoT policies (iot:* on *). A compromised device with iot:* can publish to any topic, read any shadow, and trigger any job. Use policy variables (${iot:Connection.Thing.ThingName}) to scope each device to its own resources.
- Single MQTT topic for all devices. Publishing everything to devices/telemetry makes it impossible to apply per-device access control, filter rules by device type, or subscribe to a specific device's data. Use hierarchical topics with device identity segments.
- Not using Device Shadow for desired/reported state sync. Without shadows, setting device state requires the device to be online at the exact moment the command is sent. Shadows persist the desired state and deliver it when the device reconnects.
- Storing raw telemetry in DynamoDB. At IoT scale, DynamoDB write costs explode. 10,000 devices at 1 msg/sec = 864M writes/day = ~$1,100/day on-demand. Use Timestream for time-series (10-20x cheaper for write-heavy time-series workloads) or S3 for archival ($0.023/GB/month).
- Ignoring Greengrass for edge preprocessing. Sending raw high-frequency sensor data to the cloud wastes bandwidth and inflates ingestion costs. A Greengrass component that averages 1,000 readings into 1 summary per minute cuts the number of messages sent to the cloud by 99.9%, with a corresponding drop in per-message ingestion charges.
- Not configuring fleet indexing. Without fleet indexing enabled, you cannot query which devices are running old firmware, which are disconnected, or which have specific shadow states. You are flying blind on fleet health. Enable it proactively.
- Shared X.509 certificates across devices. If one device is compromised, you must revoke the shared certificate, disconnecting all devices that use it. One certificate per device limits the blast radius to a single device.
- No rollout controls on IoT Jobs. Pushing a firmware update to all devices simultaneously risks fleet-wide failure. Always configure max concurrent targets, rollout rate, and abort thresholds (e.g., abort if >5% of devices fail).
- Ignoring Basic Ingest for high-volume telemetry. Standard publish costs $1.00 per million messages. Basic Ingest ($0.00 publish cost, rules actions still charged) saves this entirely for telemetry that only needs to flow to rules engine actions.
- Not setting MQTT session expiry. Default persistent session expiry is 1 hour. Devices that reconnect after longer disconnections lose queued messages. Set session expiry to match the device's expected offline duration (up to 7 days max).
Additional Resources
Reference Files
For detailed operational guidance, consult:
- references/rules-engine-patterns.md -- Rule SQL examples for common routing patterns, error action configuration, topic structure best practices, and Basic Ingest setup
- references/security-provisioning.md -- X.509 certificate management, fleet provisioning templates (JITP, bulk, by claim), IoT policies with variables, and custom authorizer setup
- references/greengrass-patterns.md -- Greengrass v2 component recipes, deployment configurations, stream manager setup, and local MQTT bridge configuration
Related Skills
- lambda -- Lambda functions as IoT rule actions and Greengrass components
- step-functions -- Orchestrating multi-step device provisioning and remediation workflows
- dynamodb -- Device metadata storage design, partition key strategy, TTL configuration
- s3 -- Telemetry archival, lifecycle policies, Athena integration for batch queries
- messaging -- SQS/SNS integration with IoT rules for decoupled processing and alerting
- observability -- CloudWatch metrics, alarms, and dashboards for IoT fleet monitoring
- iam -- IAM roles for IoT rules engine actions, Greengrass token exchange, and fleet provisioning
- networking -- VPC endpoints for IoT Core, private connectivity for Greengrass core devices
- security-review -- Security audit of IoT policies, certificate management, and Device Defender configuration
Output Format
When recommending an IoT architecture, include:
| Component | Choice | Rationale |
|---|---|---|
| Protocol | MQTT v5 over TLS 8883 | Bidirectional, persistent, low overhead |
| Authentication | X.509 per-device certificates via AWS Private CA | Hardware-bound identity, scalable revocation |
| Provisioning | Fleet Provisioning by Claim with pre-provisioning hook | Devices cannot be provisioned in factory |
| Topic Structure | {org}/prod/{type}/{device-id}/{category} | Per-device access control, rule targeting |
| Telemetry Ingestion | IoT Rules Engine to Timestream (Basic Ingest) | Cost-effective time-series storage |
| Device State | Named Shadows (config + diagnostics) | Offline-tolerant desired/reported sync |
| Edge Compute | Greengrass v2 with Stream Manager | Local filtering, buffered cloud upload |
| Fleet Management | Jobs (OTA) + Fleet Indexing + Device Defender | Update, query, and audit the fleet |
| Alerting | IoT Events detector model to SNS | Multi-device state correlation |
Include estimated monthly cost range using the cost-check skill.
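The original text and its copyright belong to Anthropic and the respective plugin authors. The Japanese translation was produced automatically with the Claude API.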