Skip to content

FailoverClient may have unreasonable retry when meeting a redis sentinel master switch #3540

@wyxloading

Description

@wyxloading

Expected Behavior

go-redis v9.14.0

command invocation using a FailoverClient with retry setting like this should retry until master switch operation done and send commands to new master within about 30 seconds. Actually it should cost less than 30s cause the master switch operation done in about 5 seconds.

MaxRetries:      39,
MinRetryBackoff: time.Millisecond,
MaxRetryBackoff: time.Second,

Current Behavior

Some command call will cost more than 30s ( longer than the retry settings ). That we call it unreasonable retry, cause the master switch operation done in about 4-5 seconds in our case. Mostly command call should retry and eventually done in about 5-6s at most.

Possible Solution

https://github.com/redis/go-redis/blob/v9.14.0/redis.go#L353-L388

getConn get an old master connection, before initConn, then master switch occured, that connection would be closed. Then in call initConn would wrap that connection into a SingleConnPool, and wrap into a new baseClient with the same retry settings.

If it call Hello command, it would find out that connection is already closed ( not health connection ), may get a EOF error, which result in shouldRetry true. Then it would do unreasonable retry until attempt times larger than MaxRetries setting.

Maybe we should not retry if we got a badConnError and baseClient.connPool is either SingleConnPool nor StickyConnPool

Context (Environment)

go-redis v9.14.0, with redis 5 sentinel server

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions