Skip to content

Refresh Token Load Test

Test Overview

ItemDetails
Test DateDecember 2025
Target EndpointPOST /token (grant_type=refresh_token)
PurposeMeasure refresh token rotation performance with theft detection

Test Environment

Infrastructure Configuration

ComponentTechnologyDescription
WorkerCloudflare WorkersOAuth2/OIDC endpoint processing
Durable ObjectsCloudflare DORefreshTokenRotator (token rotation management)
DatabaseCloudflare D1User info, sessions, audit logs
CacheCloudflare KVJWK cache, RBAC claim cache, user cache

Architecture

flowchart TB
    subgraph Local["Test Environment"]
        k6["k6 OSS / K6 Cloud"]
    end

    subgraph CF["Cloudflare"]
        Edge["Cloudflare Edge"]

        subgraph Worker["op-token Worker"]
            W["Token Processing + JWT Signing"]
        end

        subgraph Cache["Cache Layer"]
            KV["KV: JWK / RBAC / User Cache"]
        end

        subgraph DO["Durable Objects"]
            RTR["RefreshTokenRotator"]
        end

        subgraph DB["Database"]
            D1["D1: sessions / audit_logs"]
        end
    end

    k6 -->|HTTPS| Edge
    Edge --> W
    W --> KV
    W --> RTR
    W --> D1
    RTR --> D1

Test Configuration

SettingValueProduction Recommended
REFRESH_TOKEN_ROTATION_ENABLEDtruetrue
REFRESH_TOKEN_EXPIRY30 days30 days
ACCESS_TOKEN_EXPIRY1 hour1 hour
RBAC_CACHE_TTL5 min5 min
USER_CACHE_TTL1 hour1 hour

Test Methodology

Scenario: Refresh Token Storm

Simulates production refresh token rotation behavior:

ItemSetting
Token RotationEnabled (new refresh token issued each time)
VU DesignIndependent token family per VU
Test PatternNormal rotation path only (no error cases)
Think Time0ms (continuous requests)

TokenFamilyV2 Design

Version-based theft detection:

{
"sub": "user_id",
"client_id": "client_id",
"rtv": 5, // Refresh Token Version
"jti": "unique_id",
"exp": 1735689600
}
  • rtv (Refresh Token Version): Version number within token family
  • Old version token use → Invalidate all tokens as theft
  • State management in Durable Objects, audit persistence in D1

Preset Configuration

PresetTarget RPSDurationMax VUUse Case
rps100100 RPS2 min120Production baseline
rps200200 RPS2 min240High traffic scenario
rps300300 RPS2 min360Peak load validation

Results - 200 RPS Test

Execution Date: December 3, 2025 09:33 JST

K6 Metrics

MetricValue
Total Requests29,186
Success Rate100%
Token Rotation Success100%
Errors0

Cloudflare Analytics

MetricValueNotes
Worker Duration P509.35 msMedian
Worker Duration P7510.44 ms
Worker Duration P9039.30 ms
Worker Duration P99816.24 msTail latency
CPU Time P504.80 ms
CPU Time P9914.40 ms
DO Wall Time P509.16 msDurable Objects processing
DO Wall Time P9918.43 ms
D1 Reads10,5100.36/request
D1 Writes23,5180.81/request

DO and D1 Efficiency

MetricCalculated ValueDescription
DO Requests/Worker Request3.09Subrequest efficiency
D1 Reads/Request0.36RBAC cache hit rate > 95%
D1 Writes/Request0.81Audit log and session updates

Results - 3000 RPS Test with Sharding

Sharding Impact

ShardsDO P99DO ErrorsHTTP Failures
32781ms11,972Many
4843ms00

Dramatic improvement:

  • DO P99: 781ms → 43ms (95% reduction)
  • DO Errors: 11,972 → 0 (100% elimination)

Before vs After (3,000 RPS)

Metric32 Shards48 ShardsImprovement
Worker P5012ms12msSame
Worker P95100ms39ms-61%
Worker P99781ms43ms-94%
DO Errors11,9720-100%
Success Rate~96%100%

Optimization History

D1 Read Query Reduction

Optimization StageD1 Reads/RequestImprovement
V1 (no cache)~14.6Baseline
V2 (before RBAC cache)9.7-34%
V2 (after RBAC cache)0.36-96%

Optimizations Applied

DateOptimizationEffect
2025-12-01TokenFamilyV2 (version-based theft detection)Reduced DO storage I/O
2025-12-01UserCache (KV Read-Through)Reduced D1 user queries
2025-12-03Async audit logging (Fire-and-Forget)Reduced response latency
2025-12-03RBAC claim cache (5-min TTL)96% reduction in D1 RBAC queries

Cache Strategy

CacheTTLPurpose
USER_CACHE1 hourUser information (Read-Through)
REBAC_CACHE5 minRBAC claims (roles, permissions, groups)
CLIENTS_CACHE1 hourClient information
KeyManager5 minJWK signing keys (Worker memory)

Capacity Recommendations

UsageRecommended RPSRationale
Normal Operation≤200P99 < 500ms maintained
Peak Handling≤2,500With 48 shards
Absolute Limit≤3,000Zero errors with 48 shards

MAU Conversion

RPSToken Issuance/HourToken Issuance/DayEstimated MAU
100360,0008.6M200K-400K
200720,00017.3M500K-1M
3001,080,00025.9M1M-2M

Conversion Formula:

RPS = (MAU × DAU rate × Requests/DAU) / (Operating hours × 3600) × Peak coefficient
≈ MAU / 5,000

Key Findings

1. Sharding is Critical

48 shards completely eliminated errors at 3,000 RPS while 32 shards had 11,972 DO errors.

2. RBAC Caching Reduces D1 Load by 96%

D1 reads dropped from 14.6/request to 0.36/request.

3. Token Rotation is Reliable

100% success rate at 200 RPS with rotation enabled.

4. Worker is Efficient

CPU time remains at 4-15ms even under load.

Performance Targets

MetricTargetResultStatus
Success Rate> 99.9%100%
Token Rotation> 99%100%
Worker Duration P99< 1000ms816ms / 43ms
DO Wall Time P99< 100ms18.43ms / 43ms
D1 Reads/Request< 50.36

Scale Recommendations

Conservative Estimate

  • 200 RPS with P99 < 500ms
  • Suitable for 500K-1M MAU

Optimistic Estimate

  • 300-400 RPS (current architecture limit)
  • 2,500-3,000 RPS with 48 shards (production tested)

Future Improvements

  1. DO Sharding Extension: Shard RefreshTokenRotator by client_id + user_id
  2. D1 Read Replicas: Read optimization for global deployment
  3. Cloudflare Queues: Async batch processing for audit logs

Conclusion

Authrim’s Refresh Token endpoint with rotation achieves:

  • 100% success rate at 200 RPS
  • 100% token rotation success
  • 0.36 D1 reads per request (96% reduction via caching)
  • Zero errors at 3,000 RPS with 48 shards

Key Takeaway: Proper sharding (48 shards) transforms the system from unstable (11,972 errors) to perfectly reliable (0 errors) at 3,000 RPS.

TokenFamilyV2 design provides both security (theft detection) and performance (version-based state management).