AAE FullSync repl fails for small ring/few keys experiment #778

@russelldb

I don't have much to go on here. There is a test at https://github.com/nhs-riak/riak_test/blob/bug/rdb/riak_repl-778/tests/repl_aae_fail.erl

When running AAE fullsync replication with a small number of keys on a small ring (ring_size=8) with more than 2 workers, some keys are not replicated. I don't know whether the small number of keys is pertinent.

It may be that the more workers there are, the more keys are missed. For example, change the settings to ring_size=8, workers=20 and many keys won't be replicated. With ring_size=8, workers=8, usually only one or two (often 9) are not replicated in the test. With ring_size=8, workers=2, replication "usually" succeeds, though even with workers=2 I saw one failure.
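For anyone trying this outside the linked test, here is a rough sketch of what the ring_size=8, workers=8 case might look like as an app config fragment. This is an assumption on my part, not copied from the test: I am guessing that "workers" corresponds to the riak_repl fullsync concurrency limits (`max_fssource_node`, `max_fssink_node`, `max_fssource_cluster`), and the test may well set it some other way.

```erlang
%% Sketch only: assumes "workers" maps to the fullsync source/sink limits.
[
 {riak_core,
  [
   %% The small ring used in the experiment
   {ring_creation_size, 8}
  ]},
 {riak_kv,
  [
   %% AAE must be enabled for the AAE fullsync strategy
   {anti_entropy, {on, []}}
  ]},
 {riak_repl,
  [
   {fullsync_strategy, aae},
   %% Assumed meaning of "workers=8"; try 2 or 20 to compare with the cases above
   {max_fssource_node, 8},
   {max_fssink_node, 8},
   {max_fssource_cluster, 8}
  ]}
].
```

If the assumption holds, only the three `max_fs*` values need to change between the cases described above.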

I didn't try more typical ring sizes with a high number of workers.

Maybe this is something a QuickCheck property would be handy for.

This looks like an odd edge case, and it may need no more attention than an amendment to the documentation. However, since there is a working riak_test that is easy to tweak and run, if someone wants to look into this, have at it!
