Photo by Athanasios Papazacharias on Unsplash
I like tracking things... airplanes, cars, devices, and even Santa. Today, we’re going to learn how to track vessels using Python to collect and process the information, push it into Arc (our high-performance time-series database), and create a dashboard to visualize the collected data using Grafana.
The Challenge
Vessel tracking generates massive amounts of time-series data - position updates, speed changes, heading adjustments - all streaming in real-time from thousands of ships worldwide. This is exactly the type of high-cardinality, high-throughput workload that Arc was built for.
We’ll be using AISStream.io, which provides live vessel data via WebSocket. This data includes position, speed, heading, and navigational status - perfect for demonstrating Arc’s capabilities with Industrial IoT workloads.
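For reference, the PositionReport messages we’ll consume look roughly like this - an illustrative Python dict trimmed to just the fields the tracker below reads (real aisstream.io payloads carry additional fields):

# Illustrative shape of a PositionReport message, trimmed to the fields used later.
# Real aisstream.io payloads include additional metadata.
sample_message = {
    "MessageType": "PositionReport",
    "Message": {
        "PositionReport": {
            "UserID": 368341690,      # MMSI, the vessel identifier
            "Latitude": 37.794628,
            "Longitude": -122.318050,
            "Sog": 9.1,               # speed over ground, in knots
            "Cog": 107.5,             # course over ground, in degrees
            "NavigationalStatus": 0,
        }
    },
}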
Tracking Choice
I chose to track vessels in two distinct locations: the Port of Miami, Florida, and the Port of San Francisco, California. These ports were selected due to their high activity levels - Miami is one of the world’s busiest cruise ports and a major cargo hub, while San Francisco Bay handles massive container ship traffic.
Setting Up the Infrastructure
We’re going to use Arc as our time-series database. If you’re not familiar with Arc, it’s a high-performance time-series database built for billion-record Industrial IoT workloads. It delivers 4.21M records/sec sustained throughput and stores data in portable Parquet files you own.
For visualization, we’ll use Grafana with the Arc datasource plugin, which uses Apache Arrow for high-performance data transfer.
Let’s run everything on localhost using Docker. My docker-compose.yml file looks like this:
version: '3.8'

services:
  arc:
    image: ghcr.io/basekick-labs/arc:25.11.1
    container_name: arc
    ports:
      - "8000:8000"   # Arc HTTP API
    volumes:
      - arc-data:/app/data
    environment:
      - STORAGE_BACKEND=local
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=basekick-arc-datasource
    volumes:
      - grafana-data:/var/lib/grafana
    depends_on:
      - arc
    restart: unless-stopped

volumes:
  arc-data:
  grafana-data:
A few important things about this setup:
- Data persistence: Both Arc and Grafana persist data in Docker volumes
- Grafana: We’ll install the Arc datasource plugin after Grafana starts
For production deployments, use Docker secrets or environment files instead of hardcoded credentials.
Once you’ve customized the file, start the services:
docker compose up -d
Get Your Arc Admin Token
On first startup, Arc generates an admin API token. You need to capture this from the logs:
docker logs arc
Look for the token in the output:
======================================================================
FIRST RUN - INITIAL ADMIN TOKEN GENERATED
======================================================================
Initial admin API token: ...............................QfT5rVhLCewKA
======================================================================
SAVE THIS TOKEN! It will not be shown again.
Use this token to login to the web UI or API.
You can create additional tokens after logging in.
======================================================================
Save this token - you’ll need it for API calls and Grafana configuration!
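Before moving on, you can sanity-check the token with a quick Python call to Arc’s query endpoint (the same endpoint we use later in this article). A minimal sketch, assuming a trivial SELECT 1 is accepted:

import os
import requests

ARC_URL = "http://localhost:8000"
ARC_TOKEN = os.environ.get("ARC_TOKEN", "your-secure-token-here")

# Run a throwaway query just to confirm the token is accepted
resp = requests.post(
    f"{ARC_URL}/api/v1/query",
    headers={
        "Authorization": f"Bearer {ARC_TOKEN}",
        "Content-Type": "application/json",
    },
    json={"sql": "SELECT 1"},
)
print(resp.status_code, resp.text)  # a 200 with a small result payload means the token works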
Verify the containers are running:
docker ps
You should see:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b25f21e653f4 grafana/grafana:latest "/run.sh" 20 minutes ago Up 20 minutes 0.0.0.0:3000->3000/tcp grafana
f4f5ae585632 ghcr.io/basekick-labs/arc:25.11.1 "/app/arc" 20 minutes ago Up 20 minutes 0.0.0.0:8000->8000/tcp arc
Shaping and Pushing Data
Now for the fun part - collecting vessel data and streaming it to Arc. We’ll use Python with Arc’s HTTP API.
Note: Arc creates databases automatically when you first write data to them. Databases in Arc are namespaces that organize your tables - no need to create them explicitly!
First, install the required dependencies:
pip3 install websockets requests msgpack
Here’s the Python code to stream AIS data into Arc using MessagePack columnar format:
import asyncio
import websockets
import json
from datetime import datetime, timezone
import requests
import msgpack
import os

# Arc configuration
ARC_URL = "http://localhost:8000"
ARC_TOKEN = os.environ.get("ARC_TOKEN", "your-secure-token-here")
AIS_API_KEY = os.environ.get("AISAPIKEY")


def send_to_arc(batch_data):
    """Send batch of data points to Arc using MessagePack columnar format"""
    # Transform row data into columnar format
    if not batch_data:
        return True

    # Columnar format - arrange data by columns for optimal performance
    data = {
        "m": "ais_data",  # measurement/table name
        "columns": {
            "time": [int(d["timestamp"].timestamp() * 1000) for d in batch_data],
            "ship_id": [d["ship_id"] for d in batch_data],
            "latitude": [d["latitude"] for d in batch_data],
            "longitude": [d["longitude"] for d in batch_data],
            "speed": [d["speed"] for d in batch_data],
            "heading": [d["heading"] for d in batch_data],
            "nav_status": [d["nav_status"] for d in batch_data]
        }
    }

    try:
        response = requests.post(
            f"{ARC_URL}/api/v1/write/msgpack",
            headers={
                "Authorization": f"Bearer {ARC_TOKEN}",
                "Content-Type": "application/msgpack",
                "x-arc-database": "vessels_tracking"  # Specify database via header
            },
            data=msgpack.packb(data)
        )
        if response.status_code == 204:
            return True
        else:
            print(f"Error {response.status_code}: {response.text}")
            return False
    except requests.exceptions.RequestException as e:
        print(f"Error writing to Arc: {e}")
        return False


async def connect_ais_stream():
    """Connect to AIS stream and push data to Arc"""
    async with websockets.connect("wss://stream.aisstream.io/v0/stream") as websocket:
        subscribe_message = {
            "APIKey": AIS_API_KEY,
            "BoundingBoxes": [
                # Miami, Florida
                [[25.645, -80.345], [25.905, -80.025]],
                # San Francisco Bay, California
                [[37.45, -122.55], [37.95, -122.25]],
            ],
            "FilterMessageTypes": ["PositionReport"],
        }
        subscribe_message_json = json.dumps(subscribe_message)
        await websocket.send(subscribe_message_json)

        batch = []
        batch_size = 100  # Arc handles batches efficiently

        async for message_json in websocket:
            message = json.loads(message_json)
            message_type = message["MessageType"]

            if message_type == "PositionReport":
                ais_message = message["Message"]["PositionReport"]

                # Prepare data point for Arc
                data_point = {
                    "timestamp": datetime.now(timezone.utc),
                    "ship_id": ais_message['UserID'],
                    "latitude": ais_message['Latitude'],
                    "longitude": ais_message['Longitude'],
                    "speed": ais_message['Sog'],
                    "heading": ais_message['Cog'],
                    "nav_status": str(ais_message['NavigationalStatus'])
                }
                batch.append(data_point)

                print(f"[{data_point['timestamp'].isoformat()}] ShipId: {data_point['ship_id']} "
                      f"Lat: {data_point['latitude']:.6f} Lon: {data_point['longitude']:.6f} "
                      f"Speed: {data_point['speed']} Heading: {data_point['heading']}")

                # Send batch when it reaches batch_size
                if len(batch) >= batch_size:
                    if send_to_arc(batch):
                        print(f"✓ Sent {len(batch)} records to Arc")
                    batch = []


if __name__ == "__main__":
    if not AIS_API_KEY:
        print("Error: AISAPIKEY environment variable not set")
        print("Sign up at https://aisstream.io to get your API key")
        exit(1)

    asyncio.run(connect_ais_stream())
Key Points About This Code
- MessagePack Columnar Format: Uses Arc’s high-performance columnar protocol - data organized by columns instead of rows for optimal compression and ingestion speed
- Batching: Collects 100 data points before sending, then transforms to columnar format (Arc handles batches efficiently)
- Database Specification: The database is specified via the x-arc-database header; the measurement name (ais_data) goes in the m field
- Timestamp Conversion: Converts Python datetimes to millisecond Unix timestamps, Arc’s native time format (see the short example after this list)
- AIS API Key: Sign up at aisstream.io using your GitHub account
- Bounding Boxes: Customize coordinates for your tracking area
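To make the columnar transform and the timestamp conversion concrete, here’s a tiny standalone example with two made-up data points, shaped like the rows send_to_arc() receives above:

from datetime import datetime, timezone

# Two made-up row-oriented points, shaped like those built in connect_ais_stream()
batch_data = [
    {"timestamp": datetime(2025, 11, 19, 13, 3, 59, tzinfo=timezone.utc),
     "ship_id": 368341690, "latitude": 37.794628, "longitude": -122.318050,
     "speed": 9.1, "heading": 107.5, "nav_status": "0"},
    {"timestamp": datetime(2025, 11, 19, 13, 4, 1, tzinfo=timezone.utc),
     "ship_id": 368231420, "latitude": 37.512495, "longitude": -122.195927,
     "speed": 0, "heading": 360, "nav_status": "5"},
]

# Same transform send_to_arc() applies: one list per column,
# timestamps converted to millisecond Unix epochs (only a few columns shown here)
columns = {
    "time": [int(d["timestamp"].timestamp() * 1000) for d in batch_data],
    "ship_id": [d["ship_id"] for d in batch_data],
    "speed": [d["speed"] for d in batch_data],
}
print(columns["time"])  # [1763557439000, 1763557441000]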
Save this as vessel_tracker.py and run it:
export AISAPIKEY="your-ais-api-key"
export ARC_TOKEN="your-secure-token-here"
python3 vessel_tracker.py
If everything works, you’ll see output like this:
[2025-11-19T13:03:59.876760+00:00] ShipId: 368341690 Lat: 37.794628 Lon: -122.318050 Speed: 9.1 Heading: 107.5
[2025-11-19T13:04:01.503668+00:00] ShipId: 368231420 Lat: 37.512495 Lon: -122.195927 Speed: 0 Heading: 360
[2025-11-19T13:04:02.522329+00:00] ShipId: 366999711 Lat: 37.810505 Lon: -122.360678 Speed: 0 Heading: 289
✓ Sent 100 records to Arc
Verifying Data in Arc
Let’s confirm the data is in Arc using SQL:
curl -X POST http://localhost:8000/api/v1/query \
-H "Authorization: Bearer $ARC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"sql": "SELECT * FROM vessels_tracking.ais_data ORDER BY time DESC LIMIT 10"
}'
You should see your vessel data:
{
"columns": ["timestamp", "ship_id", "latitude", "longitude", "speed", "heading", "nav_status"],
"data": [
["2025-11-19T13:03:59.876Z", 368341690, 37.79463, -122.31805, 9.1, 107.5, "0"],
["2025-11-19T13:04:01.503Z", 368231420, 37.512493, -122.19593, 0, 360, "5"],
...
]
}
Success! Arc is ingesting and storing your vessel tracking data.
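If you prefer Python over curl, the same query looks like this (the response shape matches the JSON above):

import os
import requests

ARC_URL = "http://localhost:8000"
ARC_TOKEN = os.environ["ARC_TOKEN"]

# Same query as the curl example, sent from Python
response = requests.post(
    f"{ARC_URL}/api/v1/query",
    headers={
        "Authorization": f"Bearer {ARC_TOKEN}",
        "Content-Type": "application/json",
    },
    json={"sql": "SELECT * FROM vessels_tracking.ais_data ORDER BY time DESC LIMIT 10"},
)
result = response.json()
print(result["columns"])
for row in result["data"]:
    print(row)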
Visualizing with Grafana
Now let’s create a real-time dashboard. Go to http://localhost:3000 (username: admin, password: admin).
Install Arc Datasource Plugin
Arc has a native Grafana datasource that uses Apache Arrow for high-performance data transfer. You can either download the latest release or build from source.
Option 1: Download Release (Recommended)
# Download latest release
wget https://github.com/basekick-labs/grafana-arc-datasource/releases/latest/download/basekick-arc-datasource-1.0.0.zip
# Extract and copy to Grafana container
unzip basekick-arc-datasource-1.0.0.zip
docker cp basekick-arc-datasource grafana:/var/lib/grafana/plugins/
# Restart Grafana
docker restart grafana
Option 2: Build from Source
# Clone the repository
git clone https://github.com/basekick-labs/grafana-arc-datasource
cd grafana-arc-datasource
# Install dependencies and build
npm install
npm run build
# Copy to Grafana container
docker cp dist grafana:/var/lib/grafana/plugins/basekick-arc-datasource
# Restart Grafana
docker restart grafana
Wait a minute for Grafana to restart, then verify the plugin is loaded by checking the Grafana logs:
docker logs grafana | grep -i arc
Add Arc Data Source
Now configure the Arc datasource:
- Go to Configuration → Data sources
- Click Add data source
- Search for Arc and select it
- Configure the connection:
- URL: http://arc:8000 (use the container name since we’re in Docker)
- API Key: Your Arc authentication token (from the logs earlier)
- Database: vessels_tracking
- Click Save & Test
You should see “Data source is working” ✓
Create Your First Visualization
- Open the Explore view
- Enter this SQL query:
SELECT
time,
ship_id,
latitude,
longitude,
speed,
heading,
nav_status
FROM vessels_tracking.ais_data
WHERE $__timeFilter(time)
ORDER BY time DESC
LIMIT 1000
Note: The $__timeFilter() macro automatically adds the time range from Grafana’s time picker.
- Click Run query
You’ll see a table view of your data. Now let’s make it visual:
Create a Geomap
- Click Add to dashboard → Open Dashboard
- Click Edit on the panel
- In the Visualization dropdown, select Geomap
- Important: In the query editor, change Format from “Time series” to “Table”
- This tells Grafana to treat each row as a single data point rather than creating multiple time series
- Without this, you’ll see cluttered tooltips with many series
You’ll now see vessel positions plotted on a map! Zoom into San Francisco Bay or Miami to see the details:
Click on any red dot to see detailed vessel information including speed, heading, and navigational status.
Why Arc for Vessel Tracking?
This use case demonstrates several of Arc’s strengths:
- High throughput: Arc handles 4.21M records/sec sustained - perfect for thousands of vessels reporting positions
- Time-series optimized: Automatic time-based partitioning for fast queries
- DuckDB SQL: Full analytical SQL support (window functions, CTEs, geo queries) - see the sketch after this list
- Portable storage: Data stored in Parquet files you can query with any tool
- Low resource usage: Runs efficiently on modest hardware
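As a quick taste of that analytical SQL support, here’s a minimal sketch that asks for each vessel’s latest reported position using a window function, sent through the same query endpoint (the query itself is an illustration; adjust column names if your schema differs):

import os
import requests

ARC_URL = "http://localhost:8000"
ARC_TOKEN = os.environ["ARC_TOKEN"]

# Latest known position per vessel: rank rows per ship_id, keep the newest
sql = """
SELECT time, ship_id, latitude, longitude, speed, heading
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ship_id ORDER BY time DESC) AS rn
    FROM vessels_tracking.ais_data
)
WHERE rn = 1
ORDER BY time DESC
"""

resp = requests.post(
    f"{ARC_URL}/api/v1/query",
    headers={
        "Authorization": f"Bearer {ARC_TOKEN}",
        "Content-Type": "application/json",
    },
    json={"sql": sql},
)
for row in resp.json()["data"]:
    print(row)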
Arc’s Performance at Scale
This vessel tracking workload demonstrates Arc’s real-world Industrial IoT capabilities. With approximately 100,000 vessels globally reporting position updates every 10 seconds, that’s roughly 10,000 updates per second - exactly the type of high-cardinality, high-throughput scenario Arc was designed for.
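A quick back-of-envelope calculation makes that scale concrete (figures from the paragraph above):

# Back-of-envelope: global AIS position-report volume
vessels = 100_000            # approximate number of vessels reporting worldwide
report_interval_s = 10       # one position update every 10 seconds

updates_per_second = vessels / report_interval_s
updates_per_day = updates_per_second * 86_400

print(f"{updates_per_second:,.0f} updates/sec")  # 10,000
print(f"{updates_per_day:,.0f} records/day")     # 864,000,000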
Arc handles this workload effortlessly on a single node:
- Sustained ingestion: 4.21M records/sec capacity means plenty of headroom for growth
- Real-time queries: Sub-second response times even as your dataset grows to billions of records
- Efficient storage: Parquet compression reduces storage costs by 3-5x while maintaining query performance
- Scalable architecture: Add more vessels, ports, or data points without infrastructure changes
The same architecture scales from tracking a few dozen vessels in a single port to monitoring global maritime traffic across all major shipping lanes.
To Conclude
What a fun project! In this article, you learned how to:
- Track vessels using the AisStream API
- Stream real-time data with Python WebSockets
- Deploy Arc and Grafana with Docker
- Ingest time-series data into Arc
- Visualize vessel movements with Grafana’s Geomap
The same patterns apply to any Industrial IoT use case: fleet tracking, equipment telemetry, smart city sensors, or medical device monitoring.
Have you tried it? Let me know about your results on Twitter or LinkedIn.
Want to learn more about Arc? Check out the documentation or join our Discord.