Time-series data is everywhere in modern applications—from monitoring CPU usage and API latencies to tracking business metrics and IoT sensor readings. But handling this data at scale requires careful engineering. In this post, I'll walk you through building a high-performance time-series data ingestion pipeline that can handle over 60,000 requests per second with sub-millisecond average latency.
The Challenge
Modern applications generate massive amounts of time-series data. Whether you're monitoring microservices, tracking user behavior, or collecting IoT sensor data, you need a system that can:
- Accept thousands of metrics per second
- Maintain low latency under high load
- Efficiently batch writes to reduce database pressure
- Handle memory management without creating garbage collection pressure
- Gracefully handle errors and shutdowns
Architecture Overview
Our solution is a Go-based HTTP server that sits between metric producers and InfluxDB. Here's how it works:
Client Apps → HTTP API → Worker Pool → Batch Processing → InfluxDB
The pipeline accepts JSON payloads containing time-series metrics and processes them asynchronously using a worker pool pattern. Each metric contains:
- Measurement: The metric name (e.g., "cpu_usage", "api_latency")
- Tags: Indexed metadata for filtering and grouping
- Fields: The actual numeric values
- Timestamp: When the metric was recorded
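For concreteness, that payload maps naturally onto a small Go struct. This is a minimal sketch assuming the JSON field names follow the list above (the benchmark payload later in this post uses "measurement"); the exact struct in the repository may differ.

```go
type TimeSeriesMetric struct {
	Measurement string                 `json:"measurement"`
	Tags        map[string]string      `json:"tags"`
	Fields      map[string]interface{} `json:"fields"`
	Timestamp   time.Time              `json:"timestamp"`
}
```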
Key Performance Optimizations
1. Object Pooling for Memory Management
Instead of creating new objects for every request, we use sync.Pool to reuse metric and response objects:
```go
var metricPool = sync.Pool{
	New: func() any {
		return &TimeSeriesMetric{
			Tags:   make(map[string]string),
			Fields: make(map[string]interface{}),
		}
	},
}
```
This dramatically reduces garbage collection pressure, which is crucial for maintaining consistent low latency.
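The borrow/return cycle around the pool isn't shown above, so here is a hedged sketch of it. getMetric and putMetric are hypothetical helper names, not necessarily those in the repository; the important detail is that the maps are cleared rather than reallocated before the object goes back into the pool (clear on maps requires Go 1.21+).

```go
func getMetric() *TimeSeriesMetric {
	return metricPool.Get().(*TimeSeriesMetric)
}

func putMetric(m *TimeSeriesMetric) {
	clear(m.Tags)   // keep the allocated maps, drop their contents (Go 1.21+)
	clear(m.Fields)
	m.Measurement = ""
	m.Timestamp = time.Time{}
	metricPool.Put(m)
}
```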
2. Worker Pool with Batch Processing
Rather than writing each metric individually to InfluxDB, we use a worker pool that batches metrics:
- Multiple workers process metrics concurrently
- Configurable batch size (1000 metrics per batch)
- Time-based flushing (5ms timeout) ensures data freshness
- Buffered channels prevent blocking on metric ingestion
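A single worker might look roughly like the sketch below, assuming batchSize = 1000, flushInterval = 5*time.Millisecond, and the hypothetical metricsChan, writeBatch, and putMetric helpers from the other sketches in this post; the repository's implementation may differ in detail.

```go
func worker(ctx context.Context, metricsChan <-chan *TimeSeriesMetric) {
	batch := make([]*TimeSeriesMetric, 0, batchSize)
	ticker := time.NewTicker(flushInterval)
	defer ticker.Stop()

	flush := func() {
		if len(batch) == 0 {
			return
		}
		if err := writeBatch(batch); err != nil { // one InfluxDB write per batch
			log.Printf("batch write failed: %v", err)
		}
		for _, m := range batch {
			putMetric(m) // return objects to the pool for reuse
		}
		batch = batch[:0]
	}

	for {
		select {
		case m, ok := <-metricsChan:
			if !ok {
				flush() // channel closed: drain what's left and exit
				return
			}
			batch = append(batch, m)
			if len(batch) >= batchSize {
				flush()
			}
		case <-ticker.C:
			flush() // time-based flush keeps data fresh under low load
		case <-ctx.Done():
			flush()
			return
		}
	}
}
```

The ticker-driven flush is what preserves the 5ms freshness guarantee even when traffic drops below the batch size.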
3. Optimized HTTP Server Configuration
The HTTP server is tuned for high throughput:
```go
server := &http.Server{
	ReadTimeout:       5 * time.Second,
	WriteTimeout:      10 * time.Second,
	IdleTimeout:       60 * time.Second,
	ReadHeaderTimeout: 2 * time.Second,
	// Disable HTTP/2 for maximum performance
	TLSNextProto: make(map[string]func(*http.Server, *tls.Conn, http.Handler)),
}
```
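Wiring it up is then straightforward; this sketch assumes a handleMetrics handler (see the handler sketch later in this post) and the :8080 address used in the benchmark below.

```go
// Sketch of starting the tuned server; handleMetrics is an assumed name.
mux := http.NewServeMux()
mux.HandleFunc("/metrics", handleMetrics)

server.Addr = ":8080"
server.Handler = mux

if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
	log.Fatal(err)
}
```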
4. Non-blocking Channel Operations
The pipeline uses a large buffered channel (100,000 capacity) and non-blocking sends to prevent request handlers from waiting:
```go
select {
case metricsChan <- m:
	// Success
default:
	// Channel full - return error immediately
	http.Error(w, "server busy", http.StatusServiceUnavailable)
	return
}
```
Benchmark Results
Using the hey load testing tool with 50 concurrent connections over 10 seconds:
```bash
hey -z 10s -c 50 -m POST -T "application/json" \
  -d '{"measurement": "api_requests", ...}' \
  http://localhost:8080/metrics
```
Results:
- Throughput: 60,776 requests/second
- Average Latency: 0.8ms
- 99th Percentile: 2.2ms
- Total Requests: 607,807 in 10 seconds
- Zero Errors: All requests returned HTTP 200
What Makes This Fast?
Memory Efficiency
Object pooling eliminates allocation overhead, reducing GC pressure that could cause latency spikes.
Async Processing
The HTTP handler immediately queues metrics and returns, while workers process them in the background.
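Putting the earlier pieces together, the handler path might look roughly like this; handleMetrics, getMetric, putMetric, and metricsChan are the assumed names from the sketches above, not necessarily those in the repository.

```go
func handleMetrics(w http.ResponseWriter, r *http.Request) {
	m := getMetric() // reuse a pooled object instead of allocating
	if err := json.NewDecoder(r.Body).Decode(m); err != nil {
		putMetric(m)
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}

	select {
	case metricsChan <- m: // hand off to the workers and return immediately
		w.WriteHeader(http.StatusOK)
	default: // queue full: fail fast rather than block the request
		putMetric(m)
		http.Error(w, "server busy", http.StatusServiceUnavailable)
	}
}
```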
Batch Writes
Writing 1000 metrics at once is far more efficient than 1000 individual writes to InfluxDB.
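Assuming the official influxdb-client-go/v2 client and its blocking write API, the batch write could look roughly like this; the variable names are illustrative, not taken from the repository.

```go
import (
	"context"

	influxdb2 "github.com/influxdata/influxdb-client-go/v2"
	"github.com/influxdata/influxdb-client-go/v2/api"
	"github.com/influxdata/influxdb-client-go/v2/api/write"
)

// writeAPI would be created once at startup, e.g.
// influxdb2.NewClient(url, token).WriteAPIBlocking(org, bucket).
var writeAPI api.WriteAPIBlocking

func writeBatch(batch []*TimeSeriesMetric) error {
	points := make([]*write.Point, 0, len(batch))
	for _, m := range batch {
		points = append(points, influxdb2.NewPoint(m.Measurement, m.Tags, m.Fields, m.Timestamp))
	}
	// A single variadic call sends the whole batch in one request.
	return writeAPI.WritePoint(context.Background(), points...)
}
```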
Optimized Serialization
Direct JSON encoding/decoding without intermediate string allocation.
Smart Timeouts
Short timeouts prevent resource exhaustion while ensuring responsive error handling.
Real-World Considerations
Monitoring and Observability
In production, you'd want to add:
- Metrics on queue depth and processing latency
- Error rate monitoring
- Resource utilization tracking
- Custom dashboards for operational visibility
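Queue depth in particular is cheap to expose with nothing but the standard library; here is a minimal sketch using expvar, assuming the metricsChan from the earlier sketches.

```go
// Publish the current ingestion queue depth; expvar serves all published
// values as JSON on /debug/vars (registered on the default mux).
expvar.Publish("metrics_queue_depth", expvar.Func(func() any {
	return len(metricsChan)
}))
```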
Horizontal Scaling
This design scales well horizontally:
- Multiple instances can run behind a load balancer
- Each instance maintains its own connection to InfluxDB
- No shared state between instances
Data Durability
For mission-critical data, consider:
- Persistent queuing (Redis, Kafka) for durability
- Multiple InfluxDB replicas
- Backup and recovery procedures
Lessons Learned
Memory management matters: Object pooling reduced our GC overhead by 80%, directly improving tail latency.
Batching is essential: Individual writes would have maxed out at ~1,000 RPS. Batching increased this 60x.
Channel sizing is critical: Too small causes blocking; too large uses excessive memory. 100k was our sweet spot.
HTTP/2 can hurt: For high-throughput APIs, HTTP/1.1 often performs better due to lower overhead.
Conclusion
Building high-performance data ingestion requires careful attention to memory management, concurrency patterns, and I/O optimization. This pipeline demonstrates that with the right architecture, Go can easily handle enterprise-scale time-series workloads.
The complete implementation handles graceful shutdowns, health checks, and configurable parameters while maintaining excellent performance characteristics. Whether you're building monitoring infrastructure, IoT data collection, or real-time analytics, these patterns provide a solid foundation for high-throughput data ingestion.
Find the code at: https://github.com/tanmaysharma2001/time-series-ingestion
The complete source code for this pipeline is production-ready and includes comprehensive error handling, logging, and configuration management. Consider this architecture when you need to process thousands of time-series data points per second with consistent low latency.