Bad/missing joins on generated SQL (0.08112)

3 messages Options
Embed this post
Permalink
Nilson Santos Figueiredo Junior

Bad/missing joins on generated SQL (0.08112)

Reply Threaded More More options
Print post
Permalink
So, it seems like I'm always prone to this kind of stuff.
Well, anyway, today I moved some code from a server with DBIC 0.08107
to a new server, with the latest DBIC 0.08112 and some joins stopped
working.

Specifically, this query doesn't produce the desired results anymore:

  my $rs = $schema->resultset('FilingPDF')->search(
      { 'me.file_path' => undef, processing => '0', ticker => { '!=', undef } },
      {
          prefetch => { filing => 'company' },
          join     => { filing => { company => 'tickers' } },
          group_by => 'me.id'
      },
  );

Generated SQL on 0.08107 (and also on 0.08102):

<sql>
SELECT me.id, me.filing_id, me.file_path, me.queued_on,
me.processed_on, me.processing, filing.id, filing.company_id,
filing.title, filing.type, filing.date, filing.url,
filing.retrieval_date, filing.file_path, filing.processed_on,
filing.reference_date, filing.reference_quarter, company.id,
company.name, company.short_name, company.company_type,
company.market_type, company.last_update FROM filing_pdf me  JOIN
filing filing ON filing.id = me.filing_id  JOIN company company ON
company.id = filing.company_id LEFT JOIN company_ticker tickers ON
tickers.company_id = company.id WHERE ( ( me.file_path IS NULL AND
processing = ? AND ticker IS NOT NULL ) ) GROUP BY me.id
</sql>

Generated SQL on 0.08112:

<sql>
SELECT me.id, me.filing_id, me.file_path, me.queued_on,
me.processed_on, me.processing, filing.id, filing.company_id,
filing.title, filing.type, filing.date, filing.url,
filing.retrieval_date, filing.file_path, filing.processed_on,
filing.reference_date, filing.reference_quarter, company.id,
company.name, company.short_name, company.company_type,
company.market_type, company.last_update FROM (SELECT me.id,
me.filing_id, me.file_path, me.queued_on, me.processed_on,
me.processing FROM filing_pdf me JOIN filing filing ON filing.id =
me.filing_id JOIN company company ON company.id = filing.company_id
WHERE ( ( me.file_path IS NULL AND processing = ? AND ticker IS NOT
NULL ) ) GROUP BY me.id) me JOIN filing filing ON filing.id =
me.filing_id JOIN company company ON company.id = filing.company_id
WHERE ( ( me.file_path IS NULL AND processing = ? AND ticker IS NOT
NULL ) )
</sql>

The query generated by 0.08112 is not only much more convoluted, with
subselects, but outright wrong, as it does not join the "tickers"
relationship.

The relevant parts of the schema are below (column definitions
omitted, for brevity's sake):

<DBIC Schema>
package Filings::Schema::FilingPDF;
__PACKAGE__->load_components("InflateColumn::DateTime", "Core");
__PACKAGE__->table("filing_pdf");
__PACKAGE__->set_primary_key("id");
__PACKAGE__->belongs_to(
  "filing",
  "Filings::Schema::Filing",
  { id => "filing_id" },
);

package Filings::Schema::Company;
__PACKAGE__->load_components("InflateColumn::DateTime", "Core");
__PACKAGE__->table("company");
__PACKAGE__->set_primary_key("id");
__PACKAGE__->add_unique_constraint("name", ["name"]);
__PACKAGE__->has_many(
  "tickers",
  "Filings::Schema::CompanyTicker",
  { "foreign.company_id" => "self.id" },
);
__PACKAGE__->has_many(
  "filings",
  "Filings::Schema::Filing",
  { "foreign.company_id" => "self.id" },
);

package Filings::Schema::CompanyTicker;
__PACKAGE__->load_components("Core");
__PACKAGE__->table("company_ticker");
__PACKAGE__->set_primary_key("company_id", "ticker");
__PACKAGE__->add_unique_constraint("ticker", ["ticker"]);
__PACKAGE__->belongs_to(
  "company",
  "Filings::Schema::Company",
  { id => "company_id" },
);
</DBIC Schema>

Any suggestions of sane work-arounds?
Is this a bug or a feature?

For now I think I'm downgrading to a previous version which works.
Might as well try to figure out when it stopped working exactly
between 0.08107 and  0.08112 (if not on the latest version).

-Nilson Santos F. Jr.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Bad/missing joins on generated SQL (0.08112)

Reply Threaded More More options
Print post
Permalink
Nilson Santos Figueiredo Jr. wrote:

> So, it seems like I'm always prone to this kind of stuff.
> Well, anyway, today I moved some code from a server with DBIC 0.08107
> to a new server, with the latest DBIC 0.08112 and some joins stopped
> working.
>
> Specifically, this query doesn't produce the desired results anymore:
>
>   my $rs = $schema->resultset('FilingPDF')->search(
>       { 'me.file_path' => undef, processing => '0', ticker => { '!=', undef } },
>       {
>           prefetch => { filing => 'company' },
>           join     => { filing => { company => 'tickers' } },
>           group_by => 'me.id'
>       },
>   );
>

Before going further qualify your joined column names. A very complex dance
takes place to make sure the correct joins are brought in, and unqualified
column names upset it. It was deemed a reasonable compromise, as if the
joins are wrongly omitted, the query will fail, instead of returning incorrect
results.

As to why the convoluted subquery is necessary - it allows an arbitrarily
complex group_by on the main table while preserving prefetches.

Cheers

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Nilson Santos Figueiredo Junior

Re: Bad/missing joins on generated SQL (0.08112)

Reply Threaded More More options
Print post
Permalink
On Tue, Sep 29, 2009 at 4:25 PM, Peter Rabbitson <[hidden email]> wrote:
> Before going further qualify your joined column names. A very complex dance
> takes place to make sure the correct joins are brought in, and unqualified
> column names upset it. It was deemed a reasonable compromise, as if the
> joins are wrongly omitted, the query will fail, instead of returning incorrect
> results.

Thanks, this fixed it.
I usually don't qualify joined column names and everything worked fine
until now - going forward I'll try to qualify everything to play it
safe going.

I agree that failing is probably better than wrong results.

> As to why the convoluted subquery is necessary - it allows an arbitrarily
> complex group_by on the main table while preserving prefetches.

Well, the new convoluted subquery actually runs twice as fast as the
old one (on MySQL).
So I'm not complaining as long as it works. ;-)

-Nilson Santos F. Jr.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...