# Performance Benchmarks

The proof of the pudding for Obsrv's scalability.

Note: This is a work-in-progress page. The following results are from initial benchmarks; detailed results will be added once the benchmarking exercise is complete.
## Cluster Size

| Parameter | Value |
| --- | --- |
| Number of Nodes | 4 |
| Node Size | 4 cores, 16 GB |
| PV Size | 1 TB |
| Installation Mode | Obsrv with monitoring and real-time storage |
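For reference, the aggregate capacity implied by the cluster sizing above works out as follows:

```python
# Aggregate capacity of the benchmark cluster described above:
# 4 nodes, each with 4 CPU cores and 16 GB of memory.
nodes = 4
cores_per_node = 4
memory_gb_per_node = 16

total_cores = nodes * cores_per_node          # 16 cores
total_memory_gb = nodes * memory_gb_per_node  # 64 GB

print(f"Total capacity: {total_cores} cores, {total_memory_gb} GB memory")
```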
## Processing Benchmarks

The processing benchmark is independent of the number of datasets created, so the strategy is to test volume with all processing features enabled. Disabling any of them will only improve throughput.
### Configuration 1

- Dedup turned on
- De-normalization configured on 2 master datasets
- Transformations configured on 2 fields
- Event size of 1 KB
### Results

- **1 CPU, 1 GB, 1 task slot, parallelism 1**
  - ~13k events | 13 MB
  - ~750k events | 780 MB
  - ~18 million events | 18 GB
- **2 CPU, 2 GB, 2 task slots, parallelism 2**
  - ~30k events | 30 MB
  - ~1.8 million events | 1.8 GB
  - ~40 million events | 40 GB
- **4 CPU, 4 GB, 4 task slots, parallelism 4**
  - In Progress
Note: Many other scenarios with varying Flink configurations are being benchmarked; results will be updated on completion.
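Since every event in this benchmark is ~1 KB, the reported event counts and data volumes should agree with each other. A quick sanity check against two figures from the results above:

```python
# Sanity check: with ~1 KB events, the data volume should be roughly
# (event count x 1 KB). Figures taken from the processing results above.
EVENT_SIZE_KB = 1

def expected_volume_mb(event_count):
    """Approximate data volume in MB for a given event count at 1 KB/event."""
    return event_count * EVENT_SIZE_KB / 1000  # using decimal units

# 1-CPU configuration: ~13k events -> ~13 MB
assert round(expected_volume_mb(13_000)) == 13
# 2-CPU configuration: ~30k events -> ~30 MB
assert round(expected_volume_mb(30_000)) == 30
```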
## Secor Backups Benchmark

To ensure there is no data loss across the Obsrv pipeline, all data is backed up in real time to an object store (S3) using Secor. The following are the benchmark results of the Secor backups.
Configuration 1
Total Secor processes - 7
Total CPU Allocated - 1.5 cpu
Event size of 1 kb
### Results

- ~1.6 million events | 1.6 GB
- ~100 million events | 100 GB
- ~2.4 billion events | 2.4 TB
- ~300 million events | 300 GB
Note: In DIKSHA, we observed that each Secor process with 1 CPU was able to upload 200 million events (200 GB) to Azure Blob Storage.
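With 7 Secor processes sharing 1.5 CPUs, the per-process and per-CPU shares of the first result above can be derived. This is a rough back-of-the-envelope split, assuming the load is spread evenly across processes:

```python
# Back-of-the-envelope split of the Secor throughput above,
# assuming the 7 processes share the load evenly.
processes = 7
total_cpu = 1.5
events_backed_up = 1_600_000  # ~1.6 million events (first result above)

per_process = events_backed_up / processes  # ~229k events per process
per_cpu = events_backed_up / total_cpu      # ~1.07M events per CPU

print(f"~{per_process:,.0f} events/process, ~{per_cpu:,.0f} events/CPU")
```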
## Druid Indexing Benchmark

The Druid indexing benchmark depends on the number of datasets created and the number of aggregate tables. This benchmark was run with the minimal configuration only; indexing can scale linearly with the number of CPUs provided.
### Minimum Configuration

| Parameter | Value |
| --- | --- |
| Process Name | Druid Indexer |
| CPU | 0.5 |
| Direct Memory | 2Gi |
| Heap | 9Gi |
| GlobalIngestionHeap | 8Gi |
| Workers Count | 30 |
| Pod Memory | 11Gi |
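Note that in the configuration above the JVM heap and direct memory add up exactly to the pod memory request, leaving no explicit headroom for other JVM overheads. A quick sizing sanity check:

```python
# Sizing sanity check for the Druid Indexer pod above:
# the pod request covers the JVM heap plus direct (off-heap) memory.
heap_gi = 9
direct_memory_gi = 2
pod_memory_gi = 11

assert heap_gi + direct_memory_gi == pod_memory_gi
```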
### Results

- **1 dataset**
  - ~80k events | 80 MB
  - ~4.8 million events | 4.8 GB
  - ~110 million events | 110 GB
- **2 datasets**
  - ~40k events | 40 MB
  - ~2.4 million events | 2.4 GB
  - ~55 million events | 55 GB
- **3 datasets**: In Progress
- **4 datasets**: In Progress
- **5 datasets**
  - ~35k events | 35 MB
  - ~2.1 million events | 2.1 GB
  - ~50 million events | 50 GB
Note: Results on how indexing scales as more CPU resources are provided will be added once the benchmark is complete.
## Query Benchmark

Similar to processing, the query benchmark depends on the volume of data but not on the number of datasets (or tables) created. Query performance increases linearly with the amount of CPU/memory assigned to the Druid Historical process.
### Minimum Configuration

| Parameter | Value |
| --- | --- |
| Process Name | Druid Historical |
| CPU | 2 |
| Direct Memory | 4608Mi |
| Heap | 1Gi |
| Pod Memory | 5700Mi |
| Segment Size | 4.77Gi |
| No. of Rows per Segment | 5,000,000 |
| processing.numThreads | 2 |
| processing.numMergeBuffers | 6 |
| Concurrency | 100 |
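The direct memory figure above is consistent with Druid's documented sizing rule for processing buffers: direct memory must be at least `druid.processing.buffer.sizeBytes × (numThreads + numMergeBuffers + 1)`. Working backwards, this implies a 512 MiB processing buffer; the buffer size itself is an inference, not stated in the configuration above.

```python
# Druid sizing rule: direct memory >= buffer size x
# (numThreads + numMergeBuffers + 1). The 512 MiB buffer size is
# inferred from the configuration above, not stated in it.
num_threads = 2
num_merge_buffers = 6
direct_memory_mi = 4608

buffers_needed = num_threads + num_merge_buffers + 1  # 9 buffers
buffer_size_mi = direct_memory_mi / buffers_needed    # 512 MiB each

assert buffer_size_mi == 512
```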
### Raw Table Results

| Query | Interval | Throughput | Response Times (Avg \| Min \| Max \| 90th) |
| --- | --- | --- | --- |
| Group by on Raw Data | 1 Day | 25 r/s | 392 \| 80 \| 686 \| 472 |
| Group by on Raw Data | 7 Days | 4 r/s | 4933 \| 1277 \| 8382 \| 5154 |
| Group by on Raw Data | 30 Days | In Progress | In Progress |
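For context, a "group by on raw data" load test issues Druid native groupBy queries of roughly the following shape. The datasource and dimension names below are hypothetical placeholders, not taken from the benchmark:

```python
import json

# A sketch of a Druid native groupBy query of the kind this benchmark
# exercises. The datasource and dimension names are placeholders.
group_by_query = {
    "queryType": "groupBy",
    "dataSource": "telemetry-events",       # placeholder name
    "granularity": "all",
    "dimensions": ["eid"],                  # placeholder dimension
    "aggregations": [{"type": "count", "name": "total_events"}],
    "intervals": ["2023-01-01/2023-01-02"]  # a 1-day interval
}

# A payload like this would be POSTed to the Druid Broker at /druid/v2.
print(json.dumps(group_by_query, indent=2))
```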
### Aggregate (Rollup) Table Results

| Query | Interval | Throughput | Response Times (Avg \| Min \| Max \| 90th) |
| --- | --- | --- | --- |
| Group by on Aggregate Data | 1 Day | In Progress | In Progress |
| Group by on Aggregate Data | 7 Days | In Progress | In Progress |
| Group by on Aggregate Data | 30 Days | In Progress | In Progress |
Note: Multiple query types with varying interval and Historical configuration combinations are being actively benchmarked; results will be updated once the activity is complete.
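The latency columns reported above (average, min, max, 90th percentile) can be reproduced from raw per-request latencies as follows. The sample data here is made up for illustration; it is not the benchmark's actual measurements:

```python
import statistics

# Summarize per-request latencies into the metrics reported above.
# The sample latencies are made up for illustration only.
latencies_ms = [80, 120, 250, 392, 401, 460, 470, 510, 650, 686]

avg = statistics.mean(latencies_ms)
lo, hi = min(latencies_ms), max(latencies_ms)
# quantiles(n=10) returns the 9 cut points at 10%, 20%, ..., 90%;
# the last one is the 90th percentile.
p90 = statistics.quantiles(latencies_ms, n=10)[-1]

print(f"Avg {avg:.0f} | Min {lo} | Max {hi} | 90th {p90:.0f}")
```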