#1398 ✓resolved
Xavier Shay

[BUG] Composite primary key for join models order is incorrect

Reported by Xavier Shay | September 4th, 2010 @ 03:43 PM | in 1.1

For a join model, the order of the primary key is defined by the load order of the parent models, not the order of the belongs_to definitions. This is unexpected, and pretty much indeterminate if you're autoloading models (Rails).

This is a problem for code like:

UserGroup.get(params[:user_id], params[:group_id])

The work around if you are using autoload, is to force the load of the first model at the top of the second model's file:

User # Just referencing the class name will force a load
class Group
  # ...

This script should show the bug on dm-core 1.0.0, and also on edge c3eacc067d2aa90c5e147ae9e799ab37955d138a

require 'rubygems'
require 'dm-core'

class Group
  include DataMapper::Resource

  property :id, Serial

  has n, :user_groups

class User
  include DataMapper::Resource

  property :id, Serial

  has n, :user_groups

class UserGroup
  include DataMapper::Resource
  belongs_to :user, :key => true
  belongs_to :group, :key => true


puts "This should be [user_id, group_id]"
puts UserGroup.key.map(&:name).inspect

Comments and changes to this ticket

  • Xavier Shay

    Xavier Shay September 4th, 2010 @ 03:55 PM

    • Tag changed from dm-core bug to bug, dm-core
  • Aldo Escudero

    Aldo Escudero September 4th, 2010 @ 03:57 PM

    • Tag changed from bug, dm-core to dm-core bug

    Same issue.

    ree-1.8.7-2010.02 > puts UserGroup.key.map(&:name).inspect
    [:group_id, :user_id]

  • Ted Han (knowtheory)

    Ted Han (knowtheory) September 4th, 2010 @ 04:45 PM

    >> puts UserGroup.key.map(&:name).inspect
    [:group_id, :user_id]

    additionally xshay's problem is particularly relevant when using Model.get, when key order matters.

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) September 8th, 2010 @ 11:59 AM

    In addition to forcing the inferred PK order to be determined by the order of the relationships, I wonder if we should consider changing the #get interface so that it will allow a Hash as an argument.

    While #get works nicely when there is a single PK, it is a bit weird when using CPKs. I think using a Hash might be a nicer API, then again, once you get to that point it's redundant with the #first API. I'm not entirely sure which the best approach is.

  • Xavier Shay

    Xavier Shay September 8th, 2010 @ 05:49 PM

    I reckon keep get the same, as you say you have the option of using first if the code is unclear.

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) September 9th, 2010 @ 01:01 PM

    @xshay: Agreed.

    Ok, I was thinking about this a bit more last night and I wondered if the way we store relationships is one of the causes of this problem.

    At a high level, when DM models are loaded, all the properties are inserted into a PropertySet object, which retains the order the properties were defined in. The relationships are put into a Hash. When the DataMapper.finalize method is called it walks all the descendants of DM::Model, and then initializes the foreign keys in each relationship, if they haven't been explicitly defined.

    Now, what I think is happening is that since the relationships are in a Hash, the order they are seen is nondeterministic. They could be seen in the order they are defined, in which case everything would work as expected, or they could be seen in reverse/random order, and unexpected things could happen.

    If you're running Ruby 1.9, the order keys are added to the Hash is preserved, so you probably wouldn't notice any issues. However, we still support Ruby 1.8.7, JRuby --1.8 and rbx all of which have unordered hashes; so we have to figure out a solution that will work for these and 1.9.2.

    What we've been talking about is changing the way Relationship objects are stored, and inserting them into a RelationshipSet object instead of a Hash. A RelationshipSet should have similar semantics to PropertySet, except not inherit from Array (which was a mistake in PropertySet we need to fix in order for DM to work with rbx 1.1, which is a separate issue altogether). It should preserve the order of the relationships in the exact same order as they were defined in the model. When finalize walks them, it should visit them in the order defined, so that the inferred FK(s) are created in the exact same order every time.

    Instead of inheriting from Array, the way that PropertySet does, we were going to make a base class called SubjectSet, have it inherit from an OrderedSet (a good one w/specs can be found in snusnu's do-schema), and then make a RelationshipSet that inherits from that. We'll change PropertySet to also inherit from SubjectSet; but common functionality will be pushed up to it. (BTW: in parts of DM internals, when we refer to something that can be either a Property and Relationship, we call it a subject, hence the name SubjectSet)

    Now, here's probably where we should discuss things directly instead of via a ticket. If PropertySet and RelationshipSet end up pushing all their functionality to SubjectSet, so they otherwise are identical, I question whether we actually need both types of objects. I mean, maybe it's the right thing to do, or maybe we can store all properties and relationships inside a single object. Then Model#properties can just call Model#subjects as subjects.select { |s| s.kind_of?(Property) }, and Model#relationships can just do subjects.select { |s| s.kind_of?(Relationship) } .. I don't know. Maybe that won't work for performance reasons, or it's not clean, or it introduces bugs. If it's any of those I don't think we should do it. However that's probably something we should talk about -- but I don't think it's a blocker. If it makes sense, then it'll just be another refactoring at the end of this cleanup. I think for the most part everything leading up to this choice are the logical next steps to cleanup relationships.

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) September 9th, 2010 @ 02:04 PM

    Actually I should correct one thing I mentioned above. I think that SubjectSet should has-a OrderedSet, rather than be an is-a OrderedSet.

    Since we need to specify the behaviour of #[] to accept a Relationship/Property name, it would be an LSP violation since OrderedSet#[] works with an offset.

    Other than that I think the rest of what I said still stands.

  • Xavier Shay

    Xavier Shay September 9th, 2010 @ 03:54 PM

    I don't know enough about the internals to understand most of what you said, I'm going to have to poke around the code.

    For the record, I do see the ticketed behaviour on 1.9.

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) September 9th, 2010 @ 06:29 PM

    @xshay: I could be totally wrong, but I like to try to make a guess before diving in to give me someplace to start poking. It seems to me that the order relationships are seen in could affect the order FKs are defined in.

    What would be interesting is to put some debug code in the key inference code (in DM::Association::ManyToOne::Relationship#child_key), and then see what order the #child_key methods are called in wish respect to other relationships in a model.

  • Martin Gamsjaeger (snusnu)

    Martin Gamsjaeger (snusnu) October 31st, 2010 @ 10:27 AM

    • State changed from “unconfirmed” to “confirmed”
  • Martin Gamsjaeger (snusnu)

    Martin Gamsjaeger (snusnu) December 10th, 2010 @ 10:41 AM

    • Milestone set to 1.1
    • State changed from “confirmed” to “accepted”
    • Assigned user set to “Martin Gamsjaeger (snusnu)”
    • Milestone order changed from “196269” to “0”

    I've started work on this. I have the OrderedSet class + specs working already. I should add that currently I'm not sure if we will get a unified #subjects collection. While I haven't yet particularly looked at what RelationshipSet will need, I saw quite some property related methods in PropertySet (like #typecast, #property_context, #discriminator etc).

  • Martin Gamsjaeger (snusnu)

    Martin Gamsjaeger (snusnu) December 11th, 2010 @ 12:07 AM

    Here's what I have so far, all specs are green: https://github.com/snusnu/dm-core/tree/ordered_set

    Comments very much appreciated! I'm planning to polish/finish this tomorrow and apply it to master.

  • Martin Gamsjaeger (snusnu)

    Martin Gamsjaeger (snusnu) January 12th, 2011 @ 02:03 PM

    I encourage everyone to try out: https://github.com/snusnu/dm-core/tree/1398 as i believe it fixes this issue and is quite ready to be merged if no further problems occur

  • Xavier Shay

    Xavier Shay January 12th, 2011 @ 03:18 PM

    Doesn't appear to have fixed the script above, I still get [:group_id, :user_id].

  • Martin Gamsjaeger (snusnu)

    Martin Gamsjaeger (snusnu) January 13th, 2011 @ 08:45 AM

    Interesting, and confirmed. I will need to do some more investigation on exactly which code "installs" the properties and thus the key in the join model. In any case, I think with OrderedSet handy now, we're in a much better situation to tackle this.

  • Martin Gamsjaeger (snusnu)

    Martin Gamsjaeger (snusnu) January 13th, 2011 @ 10:46 AM

    Xavier, can you please try again with the same branch but https://github.com/snusnu/dm-core/commit/d853f32dacd4c4c022b5ca60f0... included? The script shows the expected output for me now.

  • Xavier Shay

    Xavier Shay January 13th, 2011 @ 03:23 PM

    Confirmed it works, I'd be happy for you to apply this patch and mark this ticket as resolved. Thanks!

  • Xavier Shay

    Xavier Shay January 13th, 2011 @ 06:19 PM

    Removed deprecation warning from dm-constraints. Not sure the best way to commit this into the main repo without Breaking Shit (tm).


  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) January 13th, 2011 @ 09:31 PM

    • State changed from “accepted” to “resolved”

    (from [e97e9af2021660dc422a035468456ddeadd499fc]) Always access relationships in order of definition

    This refactoring introduces a few new set classes
    that are used to store relationships, properties
    and ultimately also descendant sets.

    # Implements set behavior and # keeps track of insertion order DataMapper::OrderedSet

    # Uses an OrderedSet to keep track of # objects that must respond_to?(:name) DataMapper::SubjectSet "has a" OrderedSet

    SubjectSet forms the base class for the already
    present DataMapper::PropertySet and the newly added

    PropertySet < SubjectSet RelationshipSet < SubjectSet

    This makes sure that common behavior between these
    two classes is shared, and that RelationshipSet now
    remembers the order in which relationships were added
    too. Previously, relationships were stored inside a
    hash, which would loose order on any not ruby-1.9

    By using an ordered set to store the relationships,
    we can now provide consistent foreign key property
    order when automigrating models (or fetching records
    using DataMapper::Model#get).

    This means that it's finally save to always omit an
    explicit property declaration when defining any
    relationship either with #belongs_to or #has.
    Previously, it wasn't save to call Model#get on
    a model with a CPK that consists of FK properties
    (a join table). This was because Model#get relies on the key property order which was undefined
    before this patch because it simply used a Hash

    [#1398 state:resolved] https://github.com/datamapper/dm-core/commit/e97e9af2021660dc422a03...

Please Sign in or create a free account to add a new ticket.

With your very own profile, you can contribute to projects, track your activity, watch tickets, receive and update tickets through your email and much more.

New-ticket Create new ticket

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

Referenced by