System Architecture
High-Level Overview
[Diagram: System Architecture]
Data Flow:
- Collect: Read sensors every 5 seconds
- Batch: Accumulate 180 readings (15 minutes)
- Write: Save as Hive-partitioned Parquet
- Sync: Upload to S3 automatically
- Analyze: Query with DuckDB from anywhere
Component Architecture
[Diagram: Component Architecture]
Components:
- CLI: User interface (setup, start, sync, status)
- Collector: Reads sensors, manages batches
- Polars: Fast columnar data processing
- ObStore: Efficient S3 sync (Rust-based)
- Hive Partitioning: Time-based organization
Data Flow
Sensor to Cloud Pipeline
[Diagram: Sensor to Cloud Pipeline]
Timing:
- Read Interval: 5 seconds
- Batch Duration: 900 seconds (15 minutes)
- Batch Size: ~180 readings
- Sync Interval: 15 minutes (configurable)
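
The batch size follows from the other two values: 900 seconds divided by a 5-second read interval gives 180 readings. A minimal sketch of that relationship is below. The `Timing` class and its field names are hypothetical and only illustrate the arithmetic; they are not taken from the project's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Hypothetical container for the timing values listed above."""

    read_interval_s: int = 5      # seconds between sensor reads
    batch_duration_s: int = 900   # 15 minutes per batch
    sync_interval_s: int = 900    # 15 minutes between syncs (configurable)

    @property
    def batch_size(self) -> int:
        # Number of readings that fit into one batch window.
        return self.batch_duration_s // self.read_interval_s


timing = Timing()
print(timing.batch_size)  # 180
```

Changing the read interval changes the batch size accordingly; for example, `Timing(read_interval_s=10).batch_size` is 90.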
Batch Processing Flow
[Diagram: Batch Processing Flow]
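
The accumulate-then-write pattern behind batch processing can be sketched as follows. This is an illustration under assumptions, not the collector's actual code: the `Batcher` class, its `on_flush` callback, and the dictionary-shaped readings are all hypothetical, and the real collector may trigger a flush on elapsed time rather than on a reading count.

```python
from __future__ import annotations

from typing import Callable


class Batcher:
    """Hypothetical buffer that hands off a batch once it is full."""

    def __init__(self, batch_size: int, on_flush: Callable[[list[dict]], None]):
        self.batch_size = batch_size
        self.on_flush = on_flush  # e.g. a function that writes a Parquet file
        self._buffer: list[dict] = []

    def add(self, reading: dict) -> None:
        # Store the reading and flush as soon as the batch is complete.
        self._buffer.append(reading)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Hand off whatever is buffered; an empty buffer is a no-op.
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self.on_flush(batch)


written: list[list[dict]] = []
batcher = Batcher(batch_size=180, on_flush=written.append)
for i in range(180):
    batcher.add({"reading": i})
print(len(written), len(written[0]))  # 1 180
```

Calling `flush()` explicitly on shutdown would write out a partial batch instead of dropping it.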
Storage Structure
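Your data is organized using Hive partitioning: each directory level encodes a key=value pair (station, year, month, day). A query that filters on those keys only needs to read the matching directories, which keeps queries fast and reduces the amount of data read from S3.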
output/
└── station={UUID}/
    └── year={YYYY}/
        └── month={MM}/
            └── day={DD}/
                ├── data_0900.parquet  (15-min batch)
                ├── data_0915.parquet
                └── ...
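
A minimal sketch of how a path in this layout could be built, and how one day of data could be selected for a query. The helper names `partition_path` and `day_query` are hypothetical. The sketch also assumes that the `HHMM` in the file name is the batch start time and that the caller supplies the timestamp in whatever time zone the collector uses; neither detail is stated above.

```python
from datetime import datetime
from pathlib import PurePosixPath


def partition_path(root: str, station: str, batch_start: datetime) -> PurePosixPath:
    """Build the Hive-style path for the batch that started at `batch_start`."""
    return (
        PurePosixPath(root)
        / f"station={station}"
        / f"year={batch_start:%Y}"
        / f"month={batch_start:%m}"
        / f"day={batch_start:%d}"
        / f"data_{batch_start:%H%M}.parquet"
    )


def day_query(root: str, day: datetime) -> str:
    """Return SQL text for DuckDB that reads one day's files across all stations."""
    pattern = f"{root}/station=*/year={day:%Y}/month={day:%m}/day={day:%d}/*.parquet"
    return f"SELECT * FROM read_parquet('{pattern}', hive_partitioning = true)"


start = datetime(2024, 1, 15, 9, 0)  # example timestamp, not from the source
print(partition_path("output", "STATION-UUID", start))
print(day_query("output", start))
```

With `hive_partitioning = true`, DuckDB's `read_parquet` exposes the `station`, `year`, `month`, and `day` path components as columns, so they can also be used in `WHERE` and `GROUP BY` clauses.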