Skip to content

Consider taking into account collection order in topological sort when entities with generated keys are inserted #15585

@divega

Description

@divega

This is based on a dicussion with a customer that is porting a large codebase from LINQ to SQL to EF Core. In the customer's own words:

  • There are a number of tables that currently rely on inserting records in the order of the principal's collection navigation property (which was the LINQ to SQL behaviour).
  • In the medium term these tables should probably be modified to include an OrderIndex column or similar, in which case Ordered Collection Navigation Properties (aka Indexed Collections) #9067 may be of interest to us.
  • In the short term we've worked around this by special casing some of the tables and forcing records to be inserted one at a time, in the expected order.
  • We're considering "genericising" this workaround, such that we mimic the LINQ to SQL insert order. Obviously this gets more complex when taking object dependencies into account, and we'd be interested in hearing how we might be able to take advantage of the EF Core infrastructure to support this.
  • We'd also be interested in knowing whether it'd be possible to make the EF ordering a behaviour that we could specify. e.g. EF Core needs to determine the correct dependency ordering, but after that, the insert order can be arbitrary. Would it be possible for us to contribute our input into that "arbitrariness"? I'm thinking effectively a Func<IEnumerable<object>, Func<IEnumerable<object>>, such that we can re-order as we see fit.

I discussed this with @AndriySvyryd and although currently the secondary order used is the primary key of the element of the collection, for the insert scenario in which the actual primary key hasn't been generated yet, we could use the collection order instead. That said, at the stage in the update pipeline where we are performing the topological sort, all we have is entity entries and we currently don't have access to the collection, so we it would require some refactoring that makes the feature more expensive.

The main argument for not doing this is that after the initial insertion there is no way to maintain this order, and reliance on the order can easily become a pit of failure. The proper way to achieve this would be to actually implement support for sorted collections as describe in #9067. However implementing this behavior should still be less costly than that.

If we convince ourselves that this is a good behavior to add, we should also discuss if we could implement another behavior that has been asked for in issues like #11686. That one is not about taking into account collection order, but the order in which entities were added to the DbContext for inserts with generated keys. It is possible that the implementaiton could be similar.

A couple of alternative implementatoins we discussed:

  1. Add a replaceable service to obtain the secondary order for a given entry. An implemetnation of this service could look up elements in the collection to use their position.
  2. Somehow preserve the order in which entires got created on the DbContext (and make sure when we scan collections looking for new entities we add them in order too) and then use that as a fallback secondary order. This could be implemented by either adding a serial number to entries o storing them in ordered collections and usign order preserving sorting algorithms. Note that if you change the order of elements in the collection before you save, you would still get the entities in the order in which they were originally observed.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions