Test Overview
| Item | Details |
|---|---|
| Test Date | December 12, 2025 |
| Target Endpoint | GET /userinfo |
| Purpose | Measure maximum throughput for Bearer token validation and user data retrieval |
Test Environment
K6 Cloud Configuration
Infrastructure
| Component | Technology |
|---|---|
| Compute | Cloudflare Workers |
| Key Management | Durable Objects (KeyManager) |
| Database | Cloudflare D1 |
| User Cache | Cloudflare KV (USER_CACHE) |
Test Methodology
Scenario
Pre-generate 4,000 valid access tokens stored in R2
Warmup phase: 30s at 50 RPS to activate DOs
Benchmark phase: 3 minutes at target RPS
Each VU picks a random token and calls /userinfo
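The per-VU step above can be sketched as a k6 script (a sketch only, run under the k6 runtime; `BASE_URL` and the `tokens.json` export of the R2 token set are assumptions, not the actual test code):

```javascript
import http from 'k6/http';
import { check } from 'k6';
import { SharedArray } from 'k6/data';

// Hypothetical: the 4,000 pre-generated tokens exported from R2 to a local file.
const tokens = new SharedArray('tokens', () => JSON.parse(open('./tokens.json')));

export default function () {
  // Each iteration picks a random pre-generated access token...
  const token = tokens[Math.floor(Math.random() * tokens.length)];
  // ...and calls GET /userinfo with it as a Bearer credential.
  const res = http.get(`${__ENV.BASE_URL}/userinfo`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  check(res, {
    'status is 200': (r) => r.status === 200,
    'sub claim present': (r) => r.json('sub') !== undefined,
  });
}
```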
Load Pattern
```javascript
// Warmup scenario
executor: 'constant-arrival-rate',

// Benchmark scenario (2,000 RPS target shown)
executor: 'ramping-arrival-rate',
stages: [
  { target: 1000, duration: '15s' },
  { target: 2000, duration: '180s' },
  { target: 0, duration: '15s' },
],
```
Test Duration
Warmup: 30 seconds at 50 RPS
Benchmark: 3 minutes 30 seconds
Total: ~4 minutes per RPS target
Test Configuration
Authentication Flow
| Parameter | Value |
|---|---|
| Token Type | Bearer (JWT) |
| Signature Algorithm | RS256 |
| JWK Source | KeyManager DO (cached) |
| Token Count | 4,000 pre-generated |
Success Criteria
HTTP 200 status code
sub claim present in response
Zero HTTP failures
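In plain JavaScript, the per-response pass/fail logic amounts to the following sketch (the function and argument names are illustrative, not the actual test code):

```javascript
// Sketch: evaluate the success criteria against one /userinfo response.
// `status` is the HTTP status code, `body` the raw response body.
function passes(status, body) {
  if (status !== 200) return false; // HTTP 200 required
  let claims;
  try {
    claims = JSON.parse(body);      // body must be valid JSON
  } catch {
    return false;
  }
  // The `sub` claim must be present and non-empty.
  return typeof claims.sub === 'string' && claims.sub.length > 0;
}
```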
Summary
| RPS | Total Requests | HTTP Failures | CF Worker Errors | CF DO Errors | Status |
|---|---|---|---|---|---|
| 1,000 | 146,231 | 0 | 0 | 0 | ✅ |
| 2,000 | 293,947 | 0 | 0 | 0 | ✅ |
| 2,500 | 365,648 | 0 | 0 | 0 | ⚠️ |
| 3,000 | 436,456 | 0 | 0 | 0 | ⚠️ |
Note: ⚠️ indicates a K6 threshold was exceeded (P95 > 500ms)
K6 Client Latency (ms)
| RPS | P50 | Mean | P95 | P99 | Max |
|---|---|---|---|---|---|
| 1,000 | 114 | 117 | 139 | 200 | 4,523 |
| 2,000 | 118 | 133 | 254 | 350 | 29,717 |
| 2,500 | 127 | 174 | 325 | 585 | 5,842 |
| 3,000 | 150 | 298 | 1,032 | 1,736 | 5,462 |
Warmup Phase Latency (ms)
| RPS Target | Requests | P50 | Mean | P95 | P99 |
|---|---|---|---|---|---|
| 1,000 | ~1,500 | 112 | 112 | 133 | 138 |
| 2,000 | ~1,500 | 111 | 114 | 134 | 138 |
| 2,500 | ~1,500 | 112 | 118 | 135 | 488 |
| 3,000 | ~1,500 | 112 | 115 | 135 | 155 |
Warmup activates DOs and prevents initial cold start spikes
Results - Infrastructure Metrics
Worker Duration (ms)
| RPS | Total | P50 | P75 | P90 | P99 | P99.9 |
|---|---|---|---|---|---|---|
| 1,000 | 146,231 | 13.22 | 14.14 | 15.62 | 31.20 | 88.22 |
| 2,000 | 293,947 | 14.11 | 16.57 | 24.15 | 44.54 | 176.35 |
| 2,500 | 365,648 | 15.99 | 27.52 | 55.91 | 178.63 | 668.57 |
| 3,000 | 436,456 | 17.58 | 50.69 | 124.55 | 231.23 | 596.17 |
Worker CPU Time (ms)
| RPS | P50 | P75 | P90 | P99 | P99.9 |
|---|---|---|---|---|---|
| 1,000 | 1.10 | 1.23 | 1.50 | 4.02 | 5.28 |
| 2,000 | 1.07 | 1.18 | 1.42 | 3.96 | 4.81 |
| 2,500 | 1.06 | 1.17 | 1.43 | 3.97 | 4.85 |
| 3,000 | 1.05 | 1.17 | 1.44 | 3.98 | 4.92 |
Key Finding: CPU time is stable at ~1ms P50; JWT RS256 verification overhead is minimal
Durable Objects Wall Time (ms)
KeyManager DO for JWK retrieval and caching:
| RPS | Total DO Req | DO Errors | P50 | P75 | P90 | P99 | P99.9 |
|---|---|---|---|---|---|---|---|
| 1,000 | 146,324 | 0 | 0.82 | 1.87 | 3.38 | 7.83 | 89.34 |
| 2,000 | 294,023 | 0 | 0.46 | 0.74 | 1.63 | 6.87 | 40.59 |
| 2,500 | 352,388 | 0 | 0.40 | 0.58 | 1.19 | 5.41 | 39.51 |
| 3,000 | 366,986 | 0 | 0.38 | 0.54 | 0.94 | 6.07 | 58.62 |
Key Finding: DO wall time improves at higher RPS due to better cache hit rates
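The cache-hit effect can be illustrated with a minimal in-memory sketch (not the actual KeyManager implementation; the `fetchJwks` loader and the TTL are assumptions):

```javascript
// Sketch of the KeyManager DO's caching behavior: the first request pays the
// fetch cost, subsequent requests within the TTL are served from memory, so
// the average wall time drops as request volume (and hit rate) rises.
class KeyManagerSketch {
  constructor(fetchJwks, ttlMs = 60_000) {
    this.fetchJwks = fetchJwks; // hypothetical loader (e.g. reads key material)
    this.ttlMs = ttlMs;
    this.cached = null;
    this.expiresAt = 0;
  }

  async getJwks(now = Date.now()) {
    if (this.cached && now < this.expiresAt) return this.cached; // cache hit
    this.cached = await this.fetchJwks();                        // cache miss
    this.expiresAt = now + this.ttlMs;
    return this.cached;
  }
}
```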
D1 Database Metrics
| RPS | Read Queries | Write Queries | Rows Read | Rows Written |
|---|---|---|---|---|
| 1,000 | 525,433 | 341,182 | 470,091 | 1,982,394 |
| 2,000 | 525,821 | 341,182 | 470,479 | 1,982,394 |
| 2,500 | 529,698 | 341,182 | 474,356 | 1,982,394 |
| 3,000 | 528,988 | 341,182 | 473,646 | 1,982,394 |
Note: Write queries remain constant across RPS levels, reflecting USER_CACHE (KV) effectiveness
Capacity Recommendations
| Usage | Recommended RPS | Rationale |
|---|---|---|
| Normal Operation | ≤2,000 | K6 P99 < 350ms, CF P99 < 50ms, 0% errors |
| Peak Handling | ≤2,500 | K6 P99 < 600ms, CF P99 < 200ms, 0% errors |
| Absolute Limit | ≤3,000 | K6 P99 < 2,000ms, CF P99 < 250ms, 0% DO errors |
Key Findings
1. JWT Verification is Fast
CPU time P99: ~4ms across all RPS levels
V8 WebCrypto + JWK caching makes RS256 verification negligible
2. KeyManager DO is Ultra-Fast
Wall time P99: 5-8ms stable
High cache hit rate at high RPS
3. Worker Queuing is the Bottleneck
Worker Duration P99 rises from 31ms to 231ms as RPS increases
CPU time stays flat; the cost is request queuing, not processing
4. Cache is Effective
USER_CACHE (KV) keeps D1 writes constant
Read-through pattern prevents cold cache misses
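A read-through lookup of this shape keeps D1 out of the hot path (a sketch with in-memory stand-ins; `kv`, `db`, and `queryUser` are assumptions, not the Worker's actual bindings):

```javascript
// Sketch of the USER_CACHE read-through pattern: check KV first,
// fall back to D1 on a miss, then populate KV for subsequent requests.
async function getUser(sub, kv, db) {
  const cached = await kv.get(sub);       // KV hit: no D1 read
  if (cached !== null && cached !== undefined) return JSON.parse(cached);
  const user = await db.queryUser(sub);   // KV miss: one D1 read
  if (user) await kv.put(sub, JSON.stringify(user)); // warm the cache
  return user;
}
```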
5. 100% Success Rate at All RPS Levels
Zero HTTP failures even at 3,000 RPS
System remains reliable even when overloaded
Comparison with Silent Auth
| Endpoint | Recommended | Peak | Limit |
|---|---|---|---|
| Silent Auth | 2,000 RPS | 3,000 RPS | 4,000 RPS |
| UserInfo | 2,000 RPS | 2,500 RPS | 3,000 RPS |
UserInfo has lower throughput due to:
JWT verification overhead
D1 reads for user data
K6 Client P99 Latency
```mermaid
xychart-beta
    title "RPS vs K6 P99 Latency (ms)"
    x-axis [1000, 2000, 2500, 3000]
    y-axis "Latency (ms)" 0 --> 2000
    bar [200, 350, 585, 1736]
```
| RPS | K6 P99 | Status |
|---|---|---|
| 1,000 | 200ms | ✅ Good |
| 2,000 | 350ms | ✅ Acceptable |
| 2,500 | 585ms | ⚠️ High |
| 3,000 | 1,736ms | ❌ At limit |
Worker Duration P99
```mermaid
xychart-beta
    title "RPS vs CF Worker Duration P99 (ms)"
    x-axis [1000, 2000, 2500, 3000]
    y-axis "Duration (ms)" 0 --> 250
    bar [31, 45, 179, 231]
```
Load Degradation Pattern
| Phase | 1,000 RPS | 2,000 RPS | 2,500 RPS | 3,000 RPS |
|---|---|---|---|---|
| Stable | ~114ms | ~115ms | ~127ms | ~150ms |
| Late | ~114ms | ~177ms | ~313ms | ~573ms |
| End | ~114ms | ~111ms | ~112ms | ~113ms |
Note: Queuing accumulates over time at high RPS but normalizes immediately after ramp-down.
Bottleneck Analysis
| Layer | 1,000-2,000 RPS | 2,500 RPS | 3,000 RPS |
|---|---|---|---|
| K6 Client P99 | 200-350ms | 585ms | 1,736ms |
| Worker CPU | Stable (1-4ms) | Stable | Stable |
| Worker Duration | Stable (15-45ms) | Rising (55-179ms) | At limit (125-231ms) |
| DO Wall Time | Stable (1-8ms) | Stable | Stable |
| Verdict | Headroom | At load limit | Performance degradation |
Test Execution Details
K6 Cloud Run URLs
Conclusion
Authrim’s UserInfo endpoint achieves:
Up to 2,000 RPS: High-quality responses (K6 P99 < 350ms, CF P99 < 50ms)
Up to 2,500 RPS: Acceptable range (K6 P99 < 600ms, CF P99 < 200ms)
3,000+ RPS: Visible degradation (K6 P99 > 1,700ms)
Durable Objects (KeyManager) are fast; the bottleneck is Cloudflare Workers request queuing. Further scale-out requires multi-region deployment or Worker distribution.
A 100% success rate was achieved at all RPS levels; reliability is maintained even at throughput limits.