opensensor.space

opensensor.space cloud-native architecture

From Traditional IoT to Cloud-Native

Traditional IoT architectures typically rely on local hubs - a sensor device sending data via MQTT or HTTP to a central server (often running databases like InfluxDB or TimescaleDB). This works, but introduces complexity: multiple layers of infrastructure, potential data loss during outages, and scaling challenges as sensor networks grow.

Our First Trial: The Traditional Approach

In early 2024, we built our first environmental monitoring station following the conventional approach - a Raspberry Pi Zero W with an Enviro+ sensor, connected to an Intel NUC that served as a data hub. The NUC collected sensor data and stored it in a TimescaleDB database. It worked, but it was neither elegant nor efficient. If the hub went offline for any reason, such as a power outage, every reading the sensors sent via MQTT during that time was lost.

After a year of experimenting with cloud-native technologies, we had a realization: Why use extra energy and resources when the Raspberry Pi Zero W already has WiFi?

This insight became the foundation of opensensor.space - eliminating unnecessary infrastructure by leveraging what edge devices already have: network connectivity and storage capabilities.

Instead of managing databases and message brokers, sensors write directly to cloud object storage in open formats. This trial taught us that simplicity scales better than complexity.

opensensor.space cloud-native architecture diagram

Architecture Overview

This reference implementation uses environmental sensors, but the pattern works for any IoT data source. Explore the data structure using DuckDB:

SUMMARIZE 
    SELECT 
        * 
    FROM 
        read_parquet('s3://us-west-2.opendata.source.coop/youssef-harby/weather-station-realtime-parquet/1m_avg_daily/station=01/**/*.parquet', union_by_name=true, hive_partitioning=true);
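The same query can also be issued from Python. The sketch below only builds the SQL string for a given station and, optionally, runs it through the third-party `duckdb` package. The helper names (`station_glob`, `summarize_query`, `run_summary`) are illustrative and not part of the project. It assumes the bucket is readable anonymously, and some DuckDB versions may need additional S3 settings (such as a region) that are not shown here.

```python
# Base path copied from the query above; only the station id varies.
BASE = (
    "s3://us-west-2.opendata.source.coop/youssef-harby/"
    "weather-station-realtime-parquet/1m_avg_daily"
)


def station_glob(station_id: str) -> str:
    """Glob matching every Parquet file under one station's partition."""
    return f"{BASE}/station={station_id}/**/*.parquet"


def summarize_query(station_id: str) -> str:
    """Build the SUMMARIZE statement shown above for a given station."""
    return (
        "SUMMARIZE SELECT * FROM read_parquet("
        f"'{station_glob(station_id)}', "
        "union_by_name=true, hive_partitioning=true)"
    )


def run_summary(station_id: str):
    """Run the query with DuckDB (needs the duckdb package and network access)."""
    import duckdb  # imported lazily so the string helpers work without it

    con = duckdb.connect()
    con.execute("INSTALL httpfs")  # extension that provides s3:// access
    con.execute("LOAD httpfs")
    return con.sql(summarize_query(station_id)).fetchall()
```

Calling `summarize_query("01")` reproduces the statement above; `run_summary("01")` would execute it.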

Sensor Data Statistics

Min, max, average, and other key metrics for all sensor readings

Instead of relying on local database servers, edge devices stream sensor measurements directly to cloud storage in Parquet format. This approach eliminates unnecessary infrastructure while maintaining full functionality and enabling massive scale.

How It Works

The reference implementation:

  1. A Python script runs as a cron job every 5 minutes on the Raspberry Pi
  2. Each Parquet file contains 1-second interval measurements from the Enviro+ sensor
  3. Files are stored in a partitioned format for near real-time dashboarding:
    station={STATION_ID}/year={year}/month={month}/day={day}/data_{time}.parquet
  4. A GitHub Actions workflow runs daily to aggregate the small 5-minute files into consolidated daily files, making queries more efficient and reducing the number of small files
  5. Data is queried directly using DuckDB-wasm in the browser with Evidence.dev, creating a truly cloud-native dashboard without any intermediate servers
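The partition layout in step 3 and the daily consolidation in step 4 can be sketched with the standard library alone. This is an illustration, not the project's actual script: the zero-padded month and day, the `HHMM` form of `{time}`, and the function names are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(station_id: str, ts: datetime) -> str:
    """Object key for one upload, following the layout in step 3.

    Assumption: month/day are zero-padded and {time} is HHMM.
    """
    return (
        f"station={station_id}/year={ts.year}/month={ts.month:02d}/"
        f"day={ts.day:02d}/data_{ts:%H%M}.parquet"
    )


def group_by_day(keys):
    """Group object keys by their day partition.

    Each group lists the small files that a daily job would merge
    into one consolidated file.
    """
    groups = defaultdict(list)
    for key in keys:
        day_prefix = key.rsplit("/", 1)[0]  # drop the file name
        groups[day_prefix].append(key)
    return {prefix: sorted(files) for prefix, files in groups.items()}


# Usage: two uploads on the same day fall into one group.
keys = [
    partition_key("01", datetime(2024, 5, 7, 12, 35, tzinfo=timezone.utc)),
    partition_key("01", datetime(2024, 5, 7, 12, 40, tzinfo=timezone.utc)),
]
plan = group_by_day(keys)
```

Because the partition names are `key=value` pairs, readers such as DuckDB can expose `station`, `year`, `month`, and `day` as columns when `hive_partitioning` is enabled, as in the query earlier on this page.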

All Parquet files are hosted on Source Cooperative, a Radiant Earth initiative that provides free S3-compatible object storage for open datasets. This allows us to share sensor data openly while avoiding storage costs.

Example: Environmental Monitoring

This reference deployment uses the Enviro+ sensor pack to demonstrate the platform's capabilities with environmental data:

  • Temperature
  • Pressure
  • Humidity
  • Oxidized gases
  • Reducing gases
  • NH3 (ammonia)
  • Light levels (lux)
  • Proximity
  • Particulate Matter (PM1.0, PM2.5, and PM10)

You can find more information about the Enviro+ here:

Platform Benefits

opensensor.space demonstrates how IoT devices can participate in cloud-native architectures without complex infrastructure. By storing data in open formats like Parquet and using client-side processing with tools like DuckDB and Evidence, sensor networks become:

  • Low-carbon - Edge processing reduces data transmission by 60-90%, significantly cutting energy consumption and CO2 emissions compared to continuous cloud streaming. With data centers projected to consume 33% of ICT industry electricity by 2025, edge-first architectures are critical for sustainable IoT
  • Cost effective - No databases, message brokers, or backend servers to manage
  • Infinitely scalable - Object storage scales from single sensors to millions
  • Hardware agnostic - Works with Raspberry Pi, ESP32, Arduino, or any device that can write files
  • Resilient - Offline-first architecture with automatic sync when connectivity returns
  • Open - Standard formats (Parquet, S3) accessible to any analytics tool
  • Energy efficient - Minimal computational overhead on edge devices, ideal for battery or solar-powered deployments

Sensors operate autonomously even when offline. When internet connectivity is unavailable, the system continues collecting and storing readings locally in Parquet files. Once the connection is restored, the accumulated data synchronizes automatically to cloud object storage, so no readings are lost during network outages.
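One way to picture the offline-first behaviour is a local spool directory that is drained whenever an upload succeeds. The sketch below is hypothetical and not taken from the project: `sync_pending`, the spool layout, and the injected `upload` callable are assumptions, and it assumes `upload` raises `OSError` (which includes `ConnectionError`) when the network is down. A real S3 client may raise other exception types.

```python
from pathlib import Path
from typing import Callable, List


def sync_pending(spool_dir: Path, upload: Callable[[Path, str], None]) -> List[str]:
    """Upload spooled Parquet files in sorted order; return the uploaded keys.

    A file is deleted locally only after its upload succeeded, so a
    failure part-way through leaves the remaining files for the next run.
    """
    uploaded: List[str] = []
    # sorted() materializes the listing before any file is deleted.
    for path in sorted(spool_dir.rglob("*.parquet")):
        key = path.relative_to(spool_dir).as_posix()
        try:
            upload(path, key)
        except OSError:
            break  # assume we are still offline; retry on the next run
        path.unlink()
        uploaded.append(key)
    return uploaded
```

If the spooled files use zero-padded partition names, the sorted order is also chronological, so the oldest readings are uploaded first.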

Roadmap

We're expanding opensensor.space to support diverse sensor deployments and use cases:

Mobile & Edge Deployments

  1. LoRa mesh networks - Multiple sensors communicate via LoRa modules, with one designated gateway sensor responsible for WiFi sync to cloud storage. Recent research shows this architecture increases packet delivery ratios to 73.78% for sensors beyond 5.8 km from the gateway, enabling large-area coverage with minimal infrastructure
  2. GPS-enabled sensors - Location tracking for mobile sensor networks
  3. Vehicle-mounted units - Moving sensors for spatial data collection (air quality mapping, traffic analysis, etc.)
  4. Offline-first edge devices - Collect data continuously, sync when connectivity returns
  5. Ultra-low-power sensors - ESP32/LoRaWAN implementations optimized for battery or solar power, minimizing carbon footprint

Platform Enhancements

  • Multi-sensor support - Reference implementations for industrial sensors, agricultural IoT, smart city applications
  • Adaptive sampling - Intelligent data collection based on event triggers or anomaly detection
  • Client-side analytics - Advanced DuckDB-wasm visualizations and real-time processing
  • Edge ML - TinyML integration for on-device inference before cloud sync
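As one possible reading of the adaptive sampling item above, a simple deadband filter keeps a reading only when it differs enough from the last kept value, or when too much time has passed. This is a hypothetical sketch, not the planned implementation; `adaptive_sample`, `threshold`, and `max_gap_s` are illustrative names.

```python
from typing import Iterable, Iterator, Tuple

Reading = Tuple[float, float]  # (unix timestamp in seconds, value)


def adaptive_sample(
    readings: Iterable[Reading], threshold: float, max_gap_s: float
) -> Iterator[Reading]:
    """Yield a reading when it changed by at least `threshold` since the
    last kept reading, or when `max_gap_s` seconds have passed since then."""
    last = None
    for ts, value in readings:
        if (
            last is None
            or abs(value - last[1]) >= threshold
            or ts - last[0] >= max_gap_s
        ):
            last = (ts, value)
            yield last


# Usage: temperature readings with a 0.5 degree deadband and a 60 s heartbeat.
raw = [(0, 20.0), (1, 20.1), (2, 20.05), (3, 21.0), (4, 21.1), (63, 21.1)]
kept = list(adaptive_sample(raw, threshold=0.5, max_gap_s=60))
```

The heartbeat (`max_gap_s`) ensures a steady signal still produces occasional points, so gaps in the stored data are not mistaken for an offline sensor.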

Developer Experience

  • SDK libraries - Python, JavaScript, and Rust libraries for easy integration
  • Terraform modules - Infrastructure-as-code for cloud storage setup
  • Docker containers - Containerized edge runtimes for easy deployment

This dashboard is built with Evidence, which allows us to query and visualize the Parquet data directly in the browser - no backend required!

Resources

Research & Further Reading

LoRa Mesh Networks:

Edge Computing & Sustainability:

We'd love to hear your feedback and suggestions on building sustainable, scalable sensor networks!