#357 ✓resolved
Ana Nelson

lazy loading problem when id not sorted

Reported by Ana Nelson | June 7th, 2008 @ 08:25 AM

I have a Project class which "has n" notes. Note has a description field of type Text, so description is loaded lazily by default.

When I do

Project.get(4).notes

or

Note.all(:project_id => 4)

I get back notes 5 and 6, which is correct, but BOTH notes have the description from note 6. Adding :lazy => false to the descriptions field solves the problem.

Sat, 07 Jun 2008 12:48:13 GMT ~ debug ~ SELECT "id", "created_at", "updated_at", "project_id" FROM "notes" WHERE "project_id" = 4

Sat, 07 Jun 2008 12:48:13 GMT ~ debug ~ SELECT "id", "created_at", "updated_at", "project_id", "description" FROM "notes" WHERE "project_id" = 4 AND "id" IN (6, 5)

I notice that in my postgres database, the id column is not kept sorted. So, the notes are actually in the order 1,3,2,4,6,5

The first query above returns the notes in the order 6,5. The second query returns the notes in the order 5,6 although neither query says anything about the ordering. Changing the second query to be "id" IN (5, 6) still returns the notes in the order 5,6.

I suspect that somewhere in lazy loading there is an implicit assumption that these two queries return things in the same order. But, here we aren't sorting by anything.

My notes.id column is defined in datamapper by:

property :id, Integer, :serial => true

The postgres table description is:

=# \d+ notes

Table "public.notes"

Column | Type | Modifiers | Description

-------------+-----------------------------+----------------------------------------------------+-------------

id | integer | not null default nextval('notes_id_seq'::regclass) |

created_at | timestamp without time zone | |

updated_at | timestamp without time zone | |

project_id | integer | |

description | text | |

Indexes:

"notes_pkey" PRIMARY KEY, btree (id)

Has OIDs: no

Comments and changes to this ticket

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) June 7th, 2008 @ 01:05 PM

    Within Collection#reload we perform the lazy-load and then replace the collection internally with whatever new records were returned. The reason we replace is that since the original retrieval the records could've been altered (to change the ordering) or deleted, and I wanted the Collection to reflect that. I think you'd expect reload() whether it's called on a single Resource or a Collection to give you an up-to-date snapshot of whatever is in the data source.

    I'm wondering tho, when there's no explicit :order defined in the query, we shouldn't actually replace the Array in memory. The best thing we could do is remove any entries from the original Array that were not present after the reload, and update the resources in the IdentityMap, which will automatically update the attributes of the resources in the collection. That way the random ordering of the original Array will be preserved, and we'll still have an up-to-date snapshot of the data.

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) June 7th, 2008 @ 01:21 PM

    • Milestone cleared.
    • State changed from “new” to “open”

    There's a second option for dealing with this that I've been thinking about lately.

    Aside from debugging, how often would you be pulling back records from a database without any explicit order? I tried to think about this and in every app I could remember I was ordering on something -- whether it be a creation date, or someone's name, a price or even just a numeric id -- I was always ordering on at least one property. Anytime I displayed a list of something it had to be in a non-random order.

    With this in mind, what about defining a default sort order for all queries that do not include unique/key properties?

    I know specifying the order isn't as efficient as just pulling back records in the default order, but if every query in production is going to have an order anyway, then why not default it so it's consistent? People are far more likely to notice if the ordering works in a given context if it is always consistent anyway.

    This also would make Model.first more consistent. When it's used with non-unique properties you can get random results depending on the order of the underlying data. This also makes Model.last possible too -- if every query has a default :order, then we can simply reverse the order and use Model.first to pull back the last result.

    IMHO the bug with Model.first returning inconsistent results is enough of a reason to adopt an explicit default sort order for all queries, but it would also solve the Collection ordering problem as well as make Model.last possible.

    We could define the global default to use the Model.keys, and then allow each Model to provide a default_order method that returns an Array with the order to be used. Query could be modified to use this when no order parameter was passed in.

  • Ana Nelson

    Ana Nelson June 7th, 2008 @ 03:05 PM

    Yes, I think having a default order would be a good idea, and it would also help with database transparency. I sort of expected the database to sort consistently by ID or creation order if I didn't give any explicit ordering, MySQL always seemed to store records in the same order but Postgres apparently doesn't. I know this was a rather sloppy assumption on my part, but it's an easy one to make if it's the behaviour you've always observed. (I've just started using postgres)

    Maybe it would be possible to give people an opt-out if, for some reason, they really wanted the default database ordering? Then they would be explicitly accepting responsibility for the randomness of Model.first etc ... although in that case the lazy loading issue would still need to be fixed some other way (or ban lazy loading in that situation).

    I can't think of a time I've ever had to display results without some sort of ordering either.

  • Sam Smoot

    Sam Smoot June 7th, 2008 @ 03:26 PM

    I was against a default order when I first read the suggestion. But... I think you're right. No legitimate reason not to that I can think of.

    A bit of trivia: Postgres and MySQL both return in-order results. It's just that MySQL orders on the id, and Postgres (I'm guessing) orders on the oid. You will get consistent results. And if you cluster a table in Postgres, you can get it to return a default order any way you want. The problem is that unlike Oracle, MSSQL, and the other enterprisey databases, it doesn't support an enforced cluster. So you cluster on the created_at. Great. Next time a record is inserted, that one record is out of place. BTW, when I say cluster, I'm referring to the physical order of the records on disk. Which is why it always comes back in an order; just not the one you might expect.

    Anyways, yeah, long story short, a default-order sounds like a good idea.

    *On the other hand*: The problem we have here is deeper than that. I do think it's time to reconsider the whole #replace strategy. Yes, you'll keep up with deleted records. True. But, it's that one little bit of code that spoils the party.

    And to be honest, if we're looking at this as "objects first", it doesn't really make sense to destructively manipulate the in-memory set of objects just because the store is updated. That should only happen if you explicitly reload the entire set.

    -2c

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) June 9th, 2008 @ 12:21 AM

    • Assigned user changed from “Sam Smoot” to “Dan Kubb (dkubb)”

    I added Model.default_order method the other day for some work I was doing with Collection#last. As of right now it just aliases Model.key, which I think is a decent default -- when it doubt sort by the primary keys.

    People would be able to define their own default_order method in their models if they wished, and just have it return an Array of Symbol/Property/Query::Direction objects.

    I could easily update this so that when the Query object is created (and no :order was specified) to just use Model.default_order.

    This will neatly make first() and last() have predictable results given identical data between different data stores. Regardless of what we decide with Collection#reload I think this should be done.

    As far as the deeper issue goes, I'm fairly confident the current behavior with replacement is correct. When you call Resource#reload you get a snapshot of how that Resource looks in the data store at that given point in time. I think it should be the same with Collection#reload.

    I think of a Collection as almost like a Resource, where it's attributes are the entries. Asking an object like that to reload itself should cause it to restore the state for each object, and (if necessary) re-order itself based on updated data in the data store.

  • Troy K

    Troy K June 16th, 2008 @ 04:59 PM

    I'm having a similar issue, only I'm sorting on a date field (same as above, I have note records that lazy load the text field. --note: The note resource uses STI)

    >> cn = ContactNote.all(:type_id=>5,:order=>[:occured_at.desc]).each { |c| puts c.id }
    4
    5
    3
    1
    2
    
    >> cn.each { |c| puts c.id }
    1
    2
    3
    4
    5
    
    

    Watching mysql on the backend shows that once dm goes after the lazy fields, the order of the set is completely altered.

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) June 16th, 2008 @ 06:36 PM

    Troy the problem you are describing was an issue with DM a week or two ago, but should have been fixed recently. Are you seeing this with edge DM (the one in git), or with the gem version of DM?

    (BTW since it's my understanding that this problem is resolved, the only reason this ticket is still open is to remind me to add the default ordering into the Query system in cases where an order wasn't explicitly specified)

  • Troy K

    Troy K June 16th, 2008 @ 07:07 PM

    Dan,

    My bad - as of late I've been using the cool stuff at http://datamapper.org/articles/s... (which you made right? -- great work, every project should make it this easy). But it looks like doing sake dm:update and then sake dm:install does not overwrite the old versions, so my project was still using 9.0.1 even though I had 9.2 in the gems dir.

    I ran sake dm:gems:wipe and re-installed and now the ordering is preserved with lazy, so yes it is fixed and runs great!

    thanks

    --troy

  • Dan Kubb (dkubb)

    Dan Kubb (dkubb) June 16th, 2008 @ 10:11 PM

    • State changed from “open” to “resolved”

    This ticket was resolved with the following commit:

    http://github.com/sam/dm-core/co...

Please Sign in or create a free account to add a new ticket.

With your very own profile, you can contribute to projects, track your activity, watch tickets, receive and update tickets through your email and much more.

New-ticket Create new ticket

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

People watching this ticket

Pages