Update first X rows

7 messages Options
Embed this post
Permalink
Yves Räber

Update first X rows

Reply Threaded More More options
Print post
Permalink
Hello,

I need to update the first X rows of a resultset, and I've tried something like this :

my $to_dispatch = $c->model('myDB::Table')->search($sql_args, { rows => 10 });
$to_dispatch->update( { state => 'TODO' } );

It works, but generate a query for each row.

What I would like to generate is a query that looks like "UPDATE Table SET state = 'TODO' WHERE blablabla LIMIT 10".

It it possible to do this with DBIx::Class ?

Thanks.

Yves.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Update first X rows

Reply Threaded More More options
Print post
Permalink
On Tue, Jul 14, 2009 at 09:52:12AM +0200, Yves Räber wrote:

> Hello,
>
> I need to update the first X rows of a resultset, and I've tried something like this :
>
> my $to_dispatch = $c->model('myDB::Table')->search($sql_args, { rows => 10 });
> $to_dispatch->update( { state => 'TODO' } );
>
> It works, but generate a query for each row.
>
> What I would like to generate is a query that looks like "UPDATE Table SET state = 'TODO' WHERE blablabla LIMIT 10".
>
> It it possible to do this with DBIx::Class ?

Anything is possible with DBIC, but the real question is - is it possible
with your RDBMS. What I ned to know is:

* Your rdbms type
* What kind of PK is on myDB::Table (single column or multicolumn)
* Are you absolutely sure UPDATE ... LIMIT X actually works, instead of
  LIMIT being simply ignored? Please test with manual queries.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Yves Räber

Re: Update first X rows

Reply Threaded More More options
Print post
Permalink
In reply to this post by Yves Räber
Hi,

* The RDMS is MySQL (5.0.75 with InnoDB storage engine)
* I have a single column primary key (integer, auto increment)
* It definitely supports UPDATE...LIMIT row_count (http://dev.mysql.com/doc/refman/5.0/en/update.html)
* I've already tested it, it work just fine with manual queries

And sorry but the claim in my previous mail is false : if I use the search and update query with a "rows" argument as above, the rows part it is simply ignored.

my $task_to_dispatch = $c->model('myDB::Tasks')->search($sql_args, { rows => $rows });
my $dispatched_tasks = $tasks_to_dispatch->update( { state => 'TODO' } );

If I check the generated query with DBIC_TRACE, the "LIMIT" part is not generated at all, and all the rows are updated.

If there is a more elegant way to do it without writing my own custom SQL query, it would be great.

And by the way I'm using DBIx::Class version 0.08108


----- Original Message -----
From: "Peter Rabbitson" <[hidden email]>
To: "DBIx::Class user and developer list" <[hidden email]>
Sent: Tuesday, July 14, 2009 1:29:11 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Dbix-class] Update first X rows

On Tue, Jul 14, 2009 at 09:52:12AM +0200, Yves Räber wrote:

> Hello,
>
> I need to update the first X rows of a resultset, and I've tried something like this :
>
> my $to_dispatch = $c->model('myDB::Table')->search($sql_args, { rows => 10 });
> $to_dispatch->update( { state => 'TODO' } );
>
> It works, but generate a query for each row.
>
> What I would like to generate is a query that looks like "UPDATE Table SET state = 'TODO' WHERE blablabla LIMIT 10".
>
> It it possible to do this with DBIx::Class ?

Anything is possible with DBIC, but the real question is - is it possible
with your RDBMS. What I ned to know is:

* Your rdbms type
* What kind of PK is on myDB::Table (single column or multicolumn)
* Are you absolutely sure UPDATE ... LIMIT X actually works, instead of
  LIMIT being simply ignored? Please test with manual queries.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Update first X rows

Reply Threaded More More options
Print post
Permalink
On Tue, Jul 14, 2009 at 03:20:56PM +0200, Yves Räber wrote:
> Hi,
>
> * The RDMS is MySQL (5.0.75 with InnoDB storage engine)
> * I have a single column primary key (integer, auto increment)
> * It definitely supports UPDATE...LIMIT row_count (http://dev.mysql.com/doc/refman/5.0/en/update.html)
> * I've already tested it, it work just fine with manual queries

Good to know.

> And sorry but the claim in my previous mail is false : if I use the search and update query with a "rows" argument as above, the rows part it is simply ignored.
>
> my $task_to_dispatch = $c->model('myDB::Tasks')->search($sql_args, { rows => $rows });
> my $dispatched_tasks = $tasks_to_dispatch->update( { state => 'TODO' } );
>
> If I check the generated query with DBIC_TRACE, the "LIMIT" part is not generated at all, and all the rows are updated.

You didn't pay enough atention. What happens is the code at line 1298
in DBIC::Storage::DBI::_per_row_update_delete. Pay attenion to the initial
SELECT that gathers the PKs to modify - it does contain the limit. In other
words the code behaves correctly, it just uses the slowest and most
foolproof method to accomplish the job.

> If there is a more elegant way to do it without writing my own custom SQL query, it would be great.

Yes there is, but it will require some refactoring. When I wrote the code
in question the consensus was that LIMIT is not generaly supported. Now
that I see I was wrong, I need to provide more/better hooks for the DBI
drivers to override as seen fit. I won't however be able to do this
properly until mid-august. If someone else wants to pick this up and
JFDI I'd be glad to review it sooner. Some random thoughts:

- The dispatcher in DBIC::Resultset::_rs_update_delete() needs to be more
flexible and more importantly - needs to absorb more of the logic, instead
of feeding resultset objects to the storage driver (we have the tools to
do this now). The final goal is to never have a $rs object show up in any
Storage::* code.

- The "limit only" default behavior *still* needs to be a subquery (or in
the case of unworthy databases - a _per_row_u/d, as limit at times is a
nightmare (see mssql). The idea is to be extremely conservative and build
up overrides in places where we absolutely know things will work (i.e.
an override of the appropriate hook in DBI::mysql to do a straight $op
when just a limit is present). I rather delete rows slowly off an $rs,
instead of quickly deleting the wrong ones :)

- We need better naming for the hooks. I.e. ::DBI::_multipk_update_delete
doesn't make sense in this context anymore.

Cheers

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Yves Räber

Re: Update first X rows

Reply Threaded More More options
Print post
Permalink
In reply to this post by Yves Räber
Thanks for your explanation. But things are still not clear for me.

If I create a resulset with a "rows" argument and then

1/ Get all  the rows with a ->all method : It generate a "SELECT ... LIMIT rowcount" that's perfect
2/ Update the rows with a ->update( { whatever => 1 }) : It generates a "UPDATE ..." but ignores the limit

The slow and foolpoof method would be to update the rows one by one and stop after X elements, but it's not what it is doing.

The workaround I use is to get the elements with "->next" and then update them one by one :

while(my $row = $task_to_dispatch->next) {
  $row->update({ state => 'TODO' });
}

Now I don't understand if this is by design, a missing feature or a bug. And unfortunately I don't feel competent enough to tweak in DBIC's code but I'd be happy to help with testing.

Yves.


----- Original Message -----
From: "Peter Rabbitson" <[hidden email]>
To: "DBIx::Class user and developer list" <[hidden email]>
Sent: Wednesday, July 15, 2009 1:48:15 AM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Dbix-class] Update first X rows

On Tue, Jul 14, 2009 at 03:20:56PM +0200, Yves Räber wrote:
> Hi,
>
> * The RDMS is MySQL (5.0.75 with InnoDB storage engine)
> * I have a single column primary key (integer, auto increment)
> * It definitely supports UPDATE...LIMIT row_count (http://dev.mysql.com/doc/refman/5.0/en/update.html)
> * I've already tested it, it work just fine with manual queries

Good to know.

> And sorry but the claim in my previous mail is false : if I use the search and update query with a "rows" argument as above, the rows part it is simply ignored.
>
> my $task_to_dispatch = $c->model('myDB::Tasks')->search($sql_args, { rows => $rows });
> my $dispatched_tasks = $tasks_to_dispatch->update( { state => 'TODO' } );
>
> If I check the generated query with DBIC_TRACE, the "LIMIT" part is not generated at all, and all the rows are updated.

You didn't pay enough atention. What happens is the code at line 1298
in DBIC::Storage::DBI::_per_row_update_delete. Pay attenion to the initial
SELECT that gathers the PKs to modify - it does contain the limit. In other
words the code behaves correctly, it just uses the slowest and most
foolproof method to accomplish the job.

> If there is a more elegant way to do it without writing my own custom SQL query, it would be great.

Yes there is, but it will require some refactoring. When I wrote the code
in question the consensus was that LIMIT is not generaly supported. Now
that I see I was wrong, I need to provide more/better hooks for the DBI
drivers to override as seen fit. I won't however be able to do this
properly until mid-august. If someone else wants to pick this up and
JFDI I'd be glad to review it sooner. Some random thoughts:

- The dispatcher in DBIC::Resultset::_rs_update_delete() needs to be more
flexible and more importantly - needs to absorb more of the logic, instead
of feeding resultset objects to the storage driver (we have the tools to
do this now). The final goal is to never have a $rs object show up in any
Storage::* code.

- The "limit only" default behavior *still* needs to be a subquery (or in
the case of unworthy databases - a _per_row_u/d, as limit at times is a
nightmare (see mssql). The idea is to be extremely conservative and build
up overrides in places where we absolutely know things will work (i.e.
an override of the appropriate hook in DBI::mysql to do a straight $op
when just a limit is present). I rather delete rows slowly off an $rs,
instead of quickly deleting the wrong ones :)

- We need better naming for the hooks. I.e. ::DBI::_multipk_update_delete
doesn't make sense in this context anymore.

Cheers

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Update first X rows

Reply Threaded More More options
Print post
Permalink
On Thu, Jul 16, 2009 at 01:53:39PM +0200, Yves Räber wrote:
> Thanks for your explanation. But things are still not clear for me.
>
> If I create a resulset with a "rows" argument and then
>
> 1/ Get all  the rows with a ->all method : It generate a "SELECT ... LIMIT rowcount" that's perfect
> 2/ Update the rows with a ->update( { whatever => 1 }) : It generates a "UPDATE ..." but ignores the limit
>
> The slow and foolpoof method would be to update the rows one by one and stop after X elements, but it's not what it is doing.
>

Sigh. It really helps if you read what people point you to (I am referring
to the function I mentioned in previous mails). Once gain - the *default*
way to do $rs->update will _never_ add a limit to the query, as limit
is simply not portable (see SQL::Abstract::Limit). The "slow and foolproof
method" we use is to:

1) start a transaction
2) *SELECT* (not update) all rows that we would want to update
3) iterate over the result and update rows one by one
4) wrap up transaction

Guess which one of the points contains the actual limit, and read the
damned trace.

To summarize - everything works correctly, this is not a bug, it is just
inefficient. I already outlined what needs to be done to make it happen,
and it will happen when someone gets the time to do it.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Yves Räber

Re: Update first X rows

Reply Threaded More More options
Print post
Permalink
Thank you.

Now it's very clear.

----- Original Message -----
From: "Peter Rabbitson" <[hidden email]>
To: "DBIx::Class user and developer list" <[hidden email]>
Sent: Thursday, July 16, 2009 3:31:23 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Dbix-class] Update first X rows

On Thu, Jul 16, 2009 at 01:53:39PM +0200, Yves Räber wrote:
> Thanks for your explanation. But things are still not clear for me.
>
> If I create a resulset with a "rows" argument and then
>
> 1/ Get all  the rows with a ->all method : It generate a "SELECT ... LIMIT rowcount" that's perfect
> 2/ Update the rows with a ->update( { whatever => 1 }) : It generates a "UPDATE ..." but ignores the limit
>
> The slow and foolpoof method would be to update the rows one by one and stop after X elements, but it's not what it is doing.
>

Sigh. It really helps if you read what people point you to (I am referring
to the function I mentioned in previous mails). Once gain - the *default*
way to do $rs->update will _never_ add a limit to the query, as limit
is simply not portable (see SQL::Abstract::Limit). The "slow and foolproof
method" we use is to:

1) start a transaction
2) *SELECT* (not update) all rows that we would want to update
3) iterate over the result and update rows one by one
4) wrap up transaction

Guess which one of the points contains the actual limit, and read the
damned trace.

To summarize - everything works correctly, this is not a bug, it is just
inefficient. I already outlined what needs to be done to make it happen,
and it will happen when someone gets the time to do it.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...