Complex join problem

11 messages Options
Embed this post
Permalink
Chris Cole

Complex join problem

Reply Threaded More More options
Print post
Permalink
Hi all,

I'm managing to 'cascade' joins along relation's relationships, but only
so far. Is there a limit?

The table definitions are at the bottom, but I'm trying to convert this
SQL into DBIx:
SELECT
db_match.read_id,db_match.seq_id,match_start,match_end,query_start,mismatch,abundance
FROM db_match
JOIN seq_reads ON db_match.read_id = seq_reads.read_id
JOIN reads2expt ON seq_reads.read_id = reads2expt.read_id
WHERE expt_id = 2

The code have is:
my @rs = $schema->resultset('DbMatch')->search(
   {
      expt_id => 2,
   },
   {
      join => {         # join on seq_reads -> reads2expt tables
         read_id => {
           reads2expts => 'read_id'
         }
      },
   }
);

This works, in that I get the right rows back, but I cannot retrieve any
specific column values from the reads2expt table (e.g. 'abundance').
Neither with a get_column() method nor directly. I've tried things like
the following with no joy:

$rs[0]->abundance;
$rs[0]->read_id->reads2expts->read_id->get_column('abundance');

I can get get the data from the other tables no problem, it's just the
reads2expt one. Anyone know where I'm going wrong?
Any help is most appreciated.
Chris

Table classes:
__PACKAGE__->load_components("Core");
__PACKAGE__->table("db_match");
__PACKAGE__->add_columns(
  "seq_id",
  { data_type => "INT", default_value => "", is_nullable => 0, size => 10 },
  "read_id",
  { data_type => "BIGINT", default_value => "", is_nullable => 0, size
=> 20 },
  "search_id",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "match_start",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "match_end",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "query_start",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "mismatch",
  { data_type => "TINYINT", default_value => "", is_nullable => 0, size
=> 3 },
);
__PACKAGE__->belongs_to("seq_id", "DB::Schema::DbSequences", { seq_id =>
"seq_id" });
__PACKAGE__->belongs_to("read_id", "DB::Schema::SeqReads", { read_id =>
"read_id" });
__PACKAGE__->belongs_to(
  "search_id",
  "DB::Schema::SearchParams",
  { search_id => "search_id" },
);

__PACKAGE__->load_components("Core");
__PACKAGE__->table("seq_reads");
__PACKAGE__->add_columns(
  "read_id",
  { data_type => "BIGINT", default_value => undef, is_nullable => 0,
size => 20 },
  "length",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "seq",
  { data_type => "VARCHAR", default_value => "", is_nullable => 0, size
=> 255 },
);
__PACKAGE__->set_primary_key("read_id");
__PACKAGE__->has_many(
  "db_matches",
  "DB::Schema::DbMatch",
  { "foreign.read_id" => "self.read_id" },
);
__PACKAGE__->has_many(
  "genome_matches",
  "DB::Schema::GenomeMatch",
  { "foreign.read_id" => "self.read_id" },
);
__PACKAGE__->has_many(
  "reads2expts",
  "DB::Schema::Reads2expt",
  { "foreign.read_id" => "self.read_id" },
);

__PACKAGE__->load_components("Core");
__PACKAGE__->table("reads2expt");
__PACKAGE__->add_columns(
  "expt_id",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "read_id",
  { data_type => "BIGINT", default_value => "", is_nullable => 0, size
=> 20 },
  "abundance",
  { data_type => "INT", default_value => "", is_nullable => 0, size => 10 },
  "read_name",
  { data_type => "VARCHAR", default_value => "", is_nullable => 0, size
=> 255 },
);
__PACKAGE__->belongs_to("expt_id", "DB::Schema::Experiment", { expt_id
=> "expt_id" });
__PACKAGE__->belongs_to("read_id", "DB::Schema::SeqReads", { read_id =>
"read_id" });


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Chris Cole wrote:

> Hi all,
>
> I'm managing to 'cascade' joins along relation's relationships, but only
> so far. Is there a limit?
>
> The table definitions are at the bottom, but I'm trying to convert this
> SQL into DBIx:
> SELECT
> db_match.read_id,db_match.seq_id,match_start,match_end,query_start,mismatch,abundance
> FROM db_match
> JOIN seq_reads ON db_match.read_id = seq_reads.read_id
> JOIN reads2expt ON seq_reads.read_id = reads2expt.read_id
> WHERE expt_id = 2
>
> The code have is:
> my @rs = $schema->resultset('DbMatch')->search(
>    {
>       expt_id => 2,
>    },
>    {
>       join => {         # join on seq_reads -> reads2expt tables
>          read_id => {
>            reads2expts => 'read_id'
>          }
>       },
>    }
> );
>
> This works, in that I get the right rows back, but I cannot retrieve any
> specific column values from the reads2expt table (e.g. 'abundance').
> Neither with a get_column() method nor directly. I've tried things like
> the following with no joy:
>
> $rs[0]->abundance;
> $rs[0]->read_id->reads2expts->read_id->get_column('abundance');
>
> I can get get the data from the other tables no problem, it's just the
> reads2expt one. Anyone know where I'm going wrong?
> Any help is most appreciated.

Your SeqReads has_many Reads2expt, thus the call:

read_id->reads2expts returns a ResultSet, not a Row. Now the real
question is - which 'abundance' do you want? The one from the read_id
of the first reads2expts? Or the last one? Or all of them?

Possibilities would be:

my $col = $rs[0]->read_id->reads2expts->first->read_id->get_column('abundance');
my @cols = $rs[0]->read_id->reads2expts->search_related ('read_id')->get_column('abundance')->all;

You get the idea

Also do not call a relationship 'read_id' - call it 'read'

Cheers

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Chris Cole

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Peter Rabbitson wrote:

> Chris Cole wrote:
>> Hi all,
>>
>> I'm managing to 'cascade' joins along relation's relationships, but only
>> so far. Is there a limit?
>>
>> The table definitions are at the bottom, but I'm trying to convert this
>> SQL into DBIx:
>> SELECT
>> db_match.read_id,db_match.seq_id,match_start,match_end,query_start,mismatch,abundance
>> FROM db_match
>> JOIN seq_reads ON db_match.read_id = seq_reads.read_id
>> JOIN reads2expt ON seq_reads.read_id = reads2expt.read_id
>> WHERE expt_id = 2
>>
>> The code have is:
>> my @rs = $schema->resultset('DbMatch')->search(
>>    {
>>       expt_id => 2,
>>    },
>>    {
>>       join => {         # join on seq_reads -> reads2expt tables
>>          read_id => {
>>            reads2expts => 'read_id'
>>          }
>>       },
>>    }
>> );
>>
>> This works, in that I get the right rows back, but I cannot retrieve any
>> specific column values from the reads2expt table (e.g. 'abundance').
>> Neither with a get_column() method nor directly. I've tried things like
>> the following with no joy:
>>
>> $rs[0]->abundance;
>> $rs[0]->read_id->reads2expts->read_id->get_column('abundance');
>>
>> I can get get the data from the other tables no problem, it's just the
>> reads2expt one. Anyone know where I'm going wrong?
>> Any help is most appreciated.
>
> Your SeqReads has_many Reads2expt, thus the call:
>
> read_id->reads2expts returns a ResultSet, not a Row.

Ah, OK.

> Now the real
> question is - which 'abundance' do you want? The one from the read_id
> of the first reads2expts? Or the last one? Or all of them?

The one that's associated with expt_id = 2

> Possibilities would be:
>
> my $col = $rs[0]->read_id->reads2expts->first->read_id->get_column('abundance');
> my @cols = $rs[0]->read_id->reads2expts->search_related ('read_id')->get_column('abundance')->all;
>
> You get the idea

I do. Got to it with this:
$rs[0]->read_id->reads2expts->search_related ('read_id', {expt_id =>
2})->get_column('abundance')->first

What I don't understand is why the expt_id => 2 in the search() doesn't
limit the join on the reads2expt table? It needs me to do the filtering
twice. The SQL returns the right thing, so why doesn't DBIC?

> Also do not call a relationship 'read_id' - call it 'read'

Yeah the names are confusing, but these were generated automatically by
DBIC::Schema::Loader. If I change them I can't regenerate the classes :(

Thanks for the help.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Chris Cole

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Chris Cole wrote:

> Peter Rabbitson wrote:
>> Chris Cole wrote:
>>> Hi all,
>>>
>>> I'm managing to 'cascade' joins along relation's relationships, but only
>>> so far. Is there a limit?
>>>
>>> The table definitions are at the bottom, but I'm trying to convert this
>>> SQL into DBIx:
>>> SELECT
>>> db_match.read_id,db_match.seq_id,match_start,match_end,query_start,mismatch,abundance
>>> FROM db_match
>>> JOIN seq_reads ON db_match.read_id = seq_reads.read_id
>>> JOIN reads2expt ON seq_reads.read_id = reads2expt.read_id
>>> WHERE expt_id = 2
>>>
>>> The code have is:
>>> my @rs = $schema->resultset('DbMatch')->search(
>>>    {
>>>       expt_id => 2,
>>>    },
>>>    {
>>>       join => {         # join on seq_reads -> reads2expt tables
>>>          read_id => {
>>>            reads2expts => 'read_id'
>>>          }
>>>       },
>>>    }
>>> );
>>>
>>> This works, in that I get the right rows back, but I cannot retrieve any
>>> specific column values from the reads2expt table (e.g. 'abundance').
>>> Neither with a get_column() method nor directly. I've tried things like
>>> the following with no joy:
>>>
>>> $rs[0]->abundance;
>>> $rs[0]->read_id->reads2expts->read_id->get_column('abundance');
>>>
>>> I can get get the data from the other tables no problem, it's just the
>>> reads2expt one. Anyone know where I'm going wrong?
>>> Any help is most appreciated.
>> Your SeqReads has_many Reads2expt, thus the call:
>>
>> read_id->reads2expts returns a ResultSet, not a Row.
>
> Ah, OK.
>
>> Now the real
>> question is - which 'abundance' do you want? The one from the read_id
>> of the first reads2expts? Or the last one? Or all of them?
>
> The one that's associated with expt_id = 2
>
>> Possibilities would be:
>>
>> my $col = $rs[0]->read_id->reads2expts->first->read_id->get_column('abundance');
>> my @cols = $rs[0]->read_id->reads2expts->search_related ('read_id')->get_column('abundance')->all;
>>
>> You get the idea
>
> I do. Got to it with this:
> $rs[0]->read_id->reads2expts->search_related ('read_id', {expt_id =>
> 2})->get_column('abundance')->first
>
> What I don't understand is why the expt_id => 2 in the search() doesn't
> limit the join on the reads2expt table? It needs me to do the filtering
> twice. The SQL returns the right thing, so why doesn't DBIC?

I also note that the search_related() step is very detrimental on speed.
   The query takes several minutes, whereas coding it in straight perl
DBI gives the same result in a few seconds.



_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Chris Cole wrote:

> Chris Cole wrote:
>> Peter Rabbitson wrote:
>>> Chris Cole wrote:
>>>> Hi all,
>>>>
>>>> I'm managing to 'cascade' joins along relation's relationships, but only
>>>> so far. Is there a limit?
>>>>
>>>> The table definitions are at the bottom, but I'm trying to convert this
>>>> SQL into DBIx:
>>>> SELECT
>>>> db_match.read_id,db_match.seq_id,match_start,match_end,query_start,mismatch,abundance
>>>> FROM db_match
>>>> JOIN seq_reads ON db_match.read_id = seq_reads.read_id
>>>> JOIN reads2expt ON seq_reads.read_id = reads2expt.read_id
>>>> WHERE expt_id = 2
>>>>
>>>> The code have is:
>>>> my @rs = $schema->resultset('DbMatch')->search(
>>>>    {
>>>>       expt_id => 2,
>>>>    },
>>>>    {
>>>>       join => {         # join on seq_reads -> reads2expt tables
>>>>          read_id => {
>>>>            reads2expts => 'read_id'
>>>>          }
>>>>       },
>>>>    }
>>>> );
>>>>
>>>> This works, in that I get the right rows back, but I cannot retrieve any
>>>> specific column values from the reads2expt table (e.g. 'abundance').
>>>> Neither with a get_column() method nor directly. I've tried things like
>>>> the following with no joy:
>>>>
>>>> $rs[0]->abundance;
>>>> $rs[0]->read_id->reads2expts->read_id->get_column('abundance');
>>>>
>>>> I can get get the data from the other tables no problem, it's just the
>>>> reads2expt one. Anyone know where I'm going wrong?
>>>> Any help is most appreciated.
>>> Your SeqReads has_many Reads2expt, thus the call:
>>>
>>> read_id->reads2expts returns a ResultSet, not a Row.
>> Ah, OK.
>>
>>> Now the real
>>> question is - which 'abundance' do you want? The one from the read_id
>>> of the first reads2expts? Or the last one? Or all of them?
>> The one that's associated with expt_id = 2
>>
>>> Possibilities would be:
>>>
>>> my $col = $rs[0]->read_id->reads2expts->first->read_id->get_column('abundance');
>>> my @cols = $rs[0]->read_id->reads2expts->search_related ('read_id')->get_column('abundance')->all;
>>>
>>> You get the idea
>> I do. Got to it with this:
>> $rs[0]->read_id->reads2expts->search_related ('read_id', {expt_id =>
>> 2})->get_column('abundance')->first
>>
>> What I don't understand is why the expt_id => 2 in the search() doesn't
>> limit the join on the reads2expt table? It needs me to do the filtering
>> twice. The SQL returns the right thing, so why doesn't DBIC?
>
> I also note that the search_related() step is very detrimental on speed.
>    The query takes several minutes, whereas coding it in straight perl
> DBI gives the same result in a few seconds.
>

Which means you're doing something wrong. Show SQL being executed
(DBIC_TRACE = 1)

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
In reply to this post by Chris Cole
Chris Cole wrote:

> Peter Rabbitson wrote:
>> Chris Cole wrote:
>>> Hi all,
>>>
>>> I'm managing to 'cascade' joins along relation's relationships, but only
>>> so far. Is there a limit?
>>>
>>> The table definitions are at the bottom, but I'm trying to convert this
>>> SQL into DBIx:
>>> SELECT
>>> db_match.read_id,db_match.seq_id,match_start,match_end,query_start,mismatch,abundance
>>> FROM db_match
>>> JOIN seq_reads ON db_match.read_id = seq_reads.read_id
>>> JOIN reads2expt ON seq_reads.read_id = reads2expt.read_id
>>> WHERE expt_id = 2
>>>
>>> The code have is:
>>> my @rs = $schema->resultset('DbMatch')->search(
>>>    {
>>>       expt_id => 2,
>>>    },
>>>    {
>>>       join => {         # join on seq_reads -> reads2expt tables
>>>          read_id => {
>>>            reads2expts => 'read_id'
>>>          }
>>>       },
>>>    }
>>> );
>>>
>>> This works, in that I get the right rows back, but I cannot retrieve any
>>> specific column values from the reads2expt table (e.g. 'abundance').
>>> Neither with a get_column() method nor directly. I've tried things like
>>> the following with no joy:
>>>
>>> $rs[0]->abundance;
>>> $rs[0]->read_id->reads2expts->read_id->get_column('abundance');
>>>
>>> I can get get the data from the other tables no problem, it's just the
>>> reads2expt one. Anyone know where I'm going wrong?
>>> Any help is most appreciated.
>> Your SeqReads has_many Reads2expt, thus the call:
>>
>> read_id->reads2expts returns a ResultSet, not a Row.
>
> Ah, OK.
>
>> Now the real
>> question is - which 'abundance' do you want? The one from the read_id
>> of the first reads2expts? Or the last one? Or all of them?
>
> The one that's associated with expt_id = 2
>
>> Possibilities would be:
>>
>> my $col = $rs[0]->read_id->reads2expts->first->read_id->get_column('abundance');
>> my @cols = $rs[0]->read_id->reads2expts->search_related ('read_id')->get_column('abundance')->all;
>>
>> You get the idea
>
> I do. Got to it with this:
> $rs[0]->read_id->reads2expts->search_related ('read_id', {expt_id =>
> 2})->get_column('abundance')->first
>
> What I don't understand is why the expt_id => 2 in the search() doesn't
> limit the join on the reads2expt table? It needs me to do the filtering
> twice. The SQL returns the right thing, so why doesn't DBIC?

I am not sure what do you mean by twice, and I am not sure why you
stuck a condition into search_related. Show generated SQL.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Chris Cole

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Peter Rabbitson wrote:

> Chris Cole wrote:
>> Peter Rabbitson wrote:
>>> Chris Cole wrote:
>>>> Hi all,
>>>>
>>>> I'm managing to 'cascade' joins along relation's relationships, but only
>>>> so far. Is there a limit?
>>>>
>>>> The table definitions are at the bottom, but I'm trying to convert this
>>>> SQL into DBIx:
>>>> SELECT
>>>> db_match.read_id,db_match.seq_id,match_start,match_end,query_start,mismatch,abundance
>>>> FROM db_match
>>>> JOIN seq_reads ON db_match.read_id = seq_reads.read_id
>>>> JOIN reads2expt ON seq_reads.read_id = reads2expt.read_id
>>>> WHERE expt_id = 2
>>>>
>>>> The code have is:
>>>> my @rs = $schema->resultset('DbMatch')->search(
>>>>    {
>>>>       expt_id => 2,
>>>>    },
>>>>    {
>>>>       join => {         # join on seq_reads -> reads2expt tables
>>>>          read_id => {
>>>>            reads2expts => 'read_id'
>>>>          }
>>>>       },
>>>>    }
>>>> );
>>>>
>>>> This works, in that I get the right rows back, but I cannot retrieve any
>>>> specific column values from the reads2expt table (e.g. 'abundance').
>>>> Neither with a get_column() method nor directly. I've tried things like
>>>> the following with no joy:
>>>>
>>>> $rs[0]->abundance;
>>>> $rs[0]->read_id->reads2expts->read_id->get_column('abundance');
>>>>
>>>> I can get get the data from the other tables no problem, it's just the
>>>> reads2expt one. Anyone know where I'm going wrong?
>>>> Any help is most appreciated.
>>> Your SeqReads has_many Reads2expt, thus the call:
>>>
>>> read_id->reads2expts returns a ResultSet, not a Row.
>> Ah, OK.
>>
>>> Now the real
>>> question is - which 'abundance' do you want? The one from the read_id
>>> of the first reads2expts? Or the last one? Or all of them?
>> The one that's associated with expt_id = 2
>>
>>> Possibilities would be:
>>>
>>> my $col = $rs[0]->read_id->reads2expts->first->read_id->get_column('abundance');
>>> my @cols = $rs[0]->read_id->reads2expts->search_related ('read_id')->get_column('abundance')->all;
>>>
>>> You get the idea
>> I do. Got to it with this:
>> $rs[0]->read_id->reads2expts->search_related ('read_id', {expt_id =>
>> 2})->get_column('abundance')->first
>>
>> What I don't understand is why the expt_id => 2 in the search() doesn't
>> limit the join on the reads2expt table? It needs me to do the filtering
>> twice. The SQL returns the right thing, so why doesn't DBIC?
>
> I am not sure what do you mean by twice, and I am not sure why you
> stuck a condition into search_related. Show generated SQL.

Because otherwise it returned all 'read_ids' that matched in the
reads2expt table. The 'abundance' value must only be for rows where
expt_id = 2 in that table. I don't understand why that requirement isn't
being fed through.

SQL:
SELECT me.seq_id, me.read_id, me.search_id, me.match_start,
me.match_end, me.query_start, me.mismatch FROM db_match me  JOIN
seq_reads read_id ON read_id.read_id = me.read_id LEFT JOIN reads2expt
reads2expts ON reads2expts.read_id = read_id.read_id LEFT JOIN seq_reads
read_id_2 ON read_id_2.read_id = reads2expts.read_id  JOIN db_sequences
seq_id ON seq_id.seq_id = me.seq_id WHERE ( ( expt_id = ? AND type = ? )
): '2', 'hairpin-miRNA'

followed by lots of lines like below. Presumably from the
search_related() step:
SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
me.read_id = ? ): '17'
SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
read_id.read_id = me.read_id WHERE ( ( expt_id = ? AND me.read_id = ? )
): '2', '17'



_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Chris Cole wrote:

> Peter Rabbitson wrote:
>> Chris Cole wrote:
>>> Peter Rabbitson wrote:
>>>> Chris Cole wrote:
>>>>> Hi all,
>>>>>
>>>>> I'm managing to 'cascade' joins along relation's relationships, but only
>>>>> so far. Is there a limit?
>>>>>
>>>>> The table definitions are at the bottom, but I'm trying to convert this
>>>>> SQL into DBIx:
>>>>> SELECT
>>>>> db_match.read_id,db_match.seq_id,match_start,match_end,query_start,mismatch,abundance
>>>>> FROM db_match
>>>>> JOIN seq_reads ON db_match.read_id = seq_reads.read_id
>>>>> JOIN reads2expt ON seq_reads.read_id = reads2expt.read_id
>>>>> WHERE expt_id = 2
>>>>>
>>>>> The code have is:
>>>>> my @rs = $schema->resultset('DbMatch')->search(
>>>>>    {
>>>>>       expt_id => 2,
>>>>>    },
>>>>>    {
>>>>>       join => {         # join on seq_reads -> reads2expt tables
>>>>>          read_id => {
>>>>>            reads2expts => 'read_id'
>>>>>          }
>>>>>       },
>>>>>    }
>>>>> );
>>>>>
>>>>> This works, in that I get the right rows back, but I cannot retrieve any
>>>>> specific column values from the reads2expt table (e.g. 'abundance').
>>>>> Neither with a get_column() method nor directly. I've tried things like
>>>>> the following with no joy:
>>>>>
>>>>> $rs[0]->abundance;
>>>>> $rs[0]->read_id->reads2expts->read_id->get_column('abundance');
>>>>>
>>>>> I can get get the data from the other tables no problem, it's just the
>>>>> reads2expt one. Anyone know where I'm going wrong?
>>>>> Any help is most appreciated.
>>>> Your SeqReads has_many Reads2expt, thus the call:
>>>>
>>>> read_id->reads2expts returns a ResultSet, not a Row.
>>> Ah, OK.
>>>
>>>> Now the real
>>>> question is - which 'abundance' do you want? The one from the read_id
>>>> of the first reads2expts? Or the last one? Or all of them?
>>> The one that's associated with expt_id = 2
>>>
>>>> Possibilities would be:
>>>>
>>>> my $col = $rs[0]->read_id->reads2expts->first->read_id->get_column('abundance');
>>>> my @cols = $rs[0]->read_id->reads2expts->search_related ('read_id')->get_column('abundance')->all;
>>>>
>>>> You get the idea
>>> I do. Got to it with this:
>>> $rs[0]->read_id->reads2expts->search_related ('read_id', {expt_id =>
>>> 2})->get_column('abundance')->first
>>>
>>> What I don't understand is why the expt_id => 2 in the search() doesn't
>>> limit the join on the reads2expt table? It needs me to do the filtering
>>> twice. The SQL returns the right thing, so why doesn't DBIC?
>> I am not sure what do you mean by twice, and I am not sure why you
>> stuck a condition into search_related. Show generated SQL.
>
> Because otherwise it returned all 'read_ids' that matched in the
> reads2expt table. The 'abundance' value must only be for rows where
> expt_id = 2 in that table. I don't understand why that requirement isn't
> being fed through.
>
> SQL:
> SELECT me.seq_id, me.read_id, me.search_id, me.match_start,
> me.match_end, me.query_start, me.mismatch FROM db_match me  JOIN
> seq_reads read_id ON read_id.read_id = me.read_id LEFT JOIN reads2expt
> reads2expts ON reads2expts.read_id = read_id.read_id LEFT JOIN seq_reads
> read_id_2 ON read_id_2.read_id = reads2expts.read_id  JOIN db_sequences
> seq_id ON seq_id.seq_id = me.seq_id WHERE ( ( expt_id = ? AND type = ? )
> ): '2', 'hairpin-miRNA'
>
> followed by lots of lines like below. Presumably from the
> search_related() step:
> SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
> me.read_id = ? ): '17'
> SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
> read_id.read_id = me.read_id WHERE ( ( expt_id = ? AND me.read_id = ? )
> ): '2', '17'
>
>

You are not prefetching properly somewhere. Yet this is getting extremely
confusing - please start fresh showing a "slow but correct" dbic query
(the entire search chain) and the resulting SQL (presumably the one above).

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Chris Cole

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Peter Rabbitson wrote:
> You are not prefetching properly somewhere. Yet this is getting extremely
> confusing - please start fresh showing a "slow but correct" dbic query
> (the entire search chain) and the resulting SQL (presumably the one above).

Ok. Sorry for not getting back earlier, I've been sidetracked with
something else.

The problem stems in trying to retrieve columns from the reads2expt
table in a SELECT.

The below works as expected:

my $schema = DB::Schema->connect();
my @data;
my @rs = $schema->resultset('DbMatch')->search(
   {
      expt_id => $dataset,
      abundance => {'>' => 1},
      type => 'hairpin-miRNA'
   },
   {
      join => [
         {         # join on seq_reads -> reads2expt tables
            read_id => {
               reads2expts => 'read_id'
            }
         },
         'seq_id'  # join db_sequences table.
      ]
   }
);

foreach my $hit (@rs) {
   push @data, [ $hit->get_column('seq_id'), $hit->match_start ];
}
return(\@data);

The SQL for the above is:
SELECT me.seq_id, me.read_id, me.search_id, me.match_start,
me.match_end, me.query_start, me.mismatch FROM db_match me  JOIN
seq_reads read_id ON read_id.read_id = me.read_id LEFT JOIN reads2expt
reads2expts ON reads2expts.read_id = read_id.read_id LEFT JOIN seq_reads
read_id_2 ON read_id_2.read_id = reads2expts.read_id  JOIN db_sequences
seq_id ON seq_id.seq_id = me.seq_id WHERE ( ( abundance > ? AND expt_id
= ? AND type = ? ) ): '1', '1', 'hairpin-miRNA'


However, if I want to get the value for the 'abundance' column for each
row I need to use a search_related() step, which dramatically slows down
the process.


my $schema = DB::Schema->connect();
my @data;
my @rs = $schema->resultset('DbMatch')->search(
   {
      expt_id => $dataset,
      abundance => {'>' => 1},
      type => 'hairpin-miRNA'
   },
   {
      join => [
         {         # join on seq_reads -> reads2expt tables
            read_id => {
               reads2expts => 'read_id'
            }
         },
         'seq_id'  # join db_sequences table.
      ]
   }
);

foreach my $hit (@rs) {
   push @data, [ $hit->get_column('seq_id'), $hit->match_start,
$hit->read_id->reads2expts->search_related('read_id')->get_column('abundance')->first
];
}
return(\@data);

The SQL:
SELECT me.seq_id, me.read_id, me.search_id, me.match_start,
me.match_end, me.query_start, me.mismatch FROM db_match me  JOIN
seq_reads read_id ON read_id.read_id = me.read_id LEFT JOIN reads2expt
reads2expts ON reads2expts.read_id = read_id.read_id LEFT JOIN seq_reads
read_id_2 ON read_id_2.read_id = reads2expts.read_id  JOIN db_sequences
seq_id ON seq_id.seq_id = me.seq_id WHERE ( ( abundance > ? AND expt_id
= ? AND type = ? ) ): '1', '1', 'hairpin-miRNA'
SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
me.read_id = ? ): '2'
SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
read_id.read_id = me.read_id WHERE ( me.read_id = ? ): '2'
SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
me.read_id = ? ): '4'
SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
read_id.read_id = me.read_id WHERE ( me.read_id = ? ): '4'
SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
me.read_id = ? ): '4'
SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
read_id.read_id = me.read_id WHERE ( me.read_id = ? ): '4'
SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
me.read_id = ? ): '4'
SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
read_id.read_id = me.read_id WHERE ( me.read_id = ? ): '4'
SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
me.read_id = ? ): '4'
SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
read_id.read_id = me.read_id WHERE ( me.read_id = ? ): '4'
SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
me.read_id = ? ): '5'
SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
read_id.read_id = me.read_id WHERE ( me.read_id = ? ): '5'
SELECT me.read_id, me.length, me.seq FROM seq_reads me WHERE (
me.read_id = ? ): '8'
SELECT abundance FROM reads2expt me  JOIN seq_reads read_id ON
read_id.read_id = me.read_id WHERE ( me.read_id = ? ): '8'
etc...

I don't understand why it needs all the extra SELECTs, but is presumably
the cause for the slow-down. Also, I don't understand why I can specify
'abundance' as a search parameter, but then have to really hunt for the
'abundance' values for each row.

Any idea how I can speed this up and remove the need to use the
search_related() to hunt down the column values? This is quite a
simplified version of what I'd really like to do, so would like to be
able to do this more simply than using SQL statements via the DBI.

Table schemata are below for completeness.
Cheers,

Chris

__PACKAGE__->load_components("Core");
__PACKAGE__->table("db_match");
__PACKAGE__->add_columns(
  "seq_id",
  { data_type => "INT", default_value => "", is_nullable => 0, size => 10 },
  "read_id",
  { data_type => "BIGINT", default_value => "", is_nullable => 0, size
=> 20 },
  "search_id",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "match_start",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "match_end",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "query_start",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "mismatch",
  { data_type => "TINYINT", default_value => "", is_nullable => 0, size
=> 3 },
);
__PACKAGE__->set_primary_key("seq_id", "read_id", "search_id",
"match_start");
__PACKAGE__->belongs_to("seq_id", "DB::Schema::DbSequences", { seq_id =>
"seq_id" });
__PACKAGE__->belongs_to("read_id", "DB::Schema::SeqReads", { read_id =>
"read_id" });
__PACKAGE__->belongs_to(
  "search_id",
  "DB::Schema::SearchParams",
  { search_id => "search_id" },
);


__PACKAGE__->load_components("Core");
__PACKAGE__->table("seq_reads");
__PACKAGE__->add_columns(
  "read_id",
  { data_type => "BIGINT", default_value => undef, is_nullable => 0,
size => 20 },
  "length",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "seq",
  { data_type => "VARCHAR", default_value => "", is_nullable => 0, size
=> 255 },
);
__PACKAGE__->set_primary_key("read_id");
__PACKAGE__->has_many(
  "db_matches",
  "DB::Schema::DbMatch",
  { "foreign.read_id" => "self.read_id" },
);
__PACKAGE__->has_many(
  "genome_matches",
  "DB::Schema::GenomeMatch",
  { "foreign.read_id" => "self.read_id" },
);
__PACKAGE__->has_many(
  "reads2expts",
  "DB::Schema::Reads2expt",
  { "foreign.read_id" => "self.read_id" },
);

__PACKAGE__->load_components("Core");
__PACKAGE__->table("reads2expt");
__PACKAGE__->add_columns(
  "expt_id",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "read_id",
  { data_type => "BIGINT", default_value => "", is_nullable => 0, size
=> 20 },
  "abundance",
  { data_type => "INT", default_value => "", is_nullable => 0, size => 10 },
  "read_name",
  { data_type => "VARCHAR", default_value => "", is_nullable => 0, size
=> 255 },
);
__PACKAGE__->belongs_to("expt_id", "DB::Schema::Experiment", { expt_id
=> "expt_id" });
__PACKAGE__->belongs_to("read_id", "DB::Schema::SeqReads", { read_id =>
"read_id" });

__PACKAGE__->load_components("Core");
__PACKAGE__->table("db_sequences");
__PACKAGE__->add_columns(
  "seq_id",
  { data_type => "INT", default_value => undef, is_nullable => 0, size
=> 10 },
  "accession",
  { data_type => "VARCHAR", default_value => "", is_nullable => 0, size
=> 255 },
  "type",
  { data_type => "VARCHAR", default_value => "", is_nullable => 0, size
=> 255 },
  "seq",
  { data_type => "TEXT", default_value => "", is_nullable => 0, size =>
65535 },
  "db_id",
  { data_type => "SMALLINT", default_value => "", is_nullable => 0, size
=> 5 },
  "species_id",
  {
    data_type => "SMALLINT",
    default_value => undef,
    is_nullable => 1,
    size => 5,
  },
);
__PACKAGE__->set_primary_key("seq_id");
__PACKAGE__->has_many(
  "db_matches",
  "DB::Schema::DbMatch",
  { "foreign.seq_id" => "self.seq_id" },
);
__PACKAGE__->belongs_to("db_id", "DB::Schema::ExtrnDb", { db_id =>
"db_id" });
__PACKAGE__->belongs_to(
  "species_id",
  "DB::Schema::Species",
  { species_id => "species_id" },
);
__PACKAGE__->has_many(
  "mature2hairpin_mature_ids",
  "DB::Schema::Mature2hairpin",
  { "foreign.mature_id" => "self.seq_id" },
);
__PACKAGE__->has_many(
  "mature2hairpin_hairpin_ids",
  "DB::Schema::Mature2hairpin",
  { "foreign.hairpin_id" => "self.seq_id" },
);



_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Peter Rabbitson-2

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Chris Cole wrote:

> Peter Rabbitson wrote:
>> You are not prefetching properly somewhere. Yet this is getting extremely
>> confusing - please start fresh showing a "slow but correct" dbic query
>> (the entire search chain) and the resulting SQL (presumably the one above).
>
> Ok. Sorry for not getting back earlier, I've been sidetracked with
> something else.
>
> The problem stems in trying to retrieve columns from the reads2expt
> table in a SELECT.
>
> The below works as expected:
>
> my $schema = DB::Schema->connect();
> my @data;
> my @rs = $schema->resultset('DbMatch')->search(
>    {
>       expt_id => $dataset,
>       abundance => {'>' => 1},
>       type => 'hairpin-miRNA'
>    },
>    {
>       join => [
>          {         # join on seq_reads -> reads2expt tables
>             read_id => {
>                reads2expts => 'read_id'
>             }
>          },
>          'seq_id'  # join db_sequences table.
>       ]
>    }
> );
>
> foreach my $hit (@rs) {
>    push @data, [ $hit->get_column('seq_id'), $hit->match_start ];
> }
> return(\@data);
>
> The SQL for the above is:
> SELECT me.seq_id, me.read_id, me.search_id, me.match_start,
> me.match_end, me.query_start, me.mismatch FROM db_match me  JOIN
> seq_reads read_id ON read_id.read_id = me.read_id LEFT JOIN reads2expt
> reads2expts ON reads2expts.read_id = read_id.read_id LEFT JOIN seq_reads
> read_id_2 ON read_id_2.read_id = reads2expts.read_id  JOIN db_sequences
> seq_id ON seq_id.seq_id = me.seq_id WHERE ( ( abundance > ? AND expt_id
> = ? AND type = ? ) ): '1', '1', 'hairpin-miRNA'
>
>
> However, if I want to get the value for the 'abundance' column for each
> row I need to use a search_related() step, which dramatically slows down
> the process.
>
>

Read up on this: http://search.cpan.org/~ribasushi/DBIx-Class-0.08103/lib/DBIx/Class/Manual/Cookbook.pod#Using_joins_and_prefetch

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Chris Cole

Re: Complex join problem

Reply Threaded More More options
Print post
Permalink
Peter Rabbitson wrote:

> Chris Cole wrote:
>> Peter Rabbitson wrote:
>>> You are not prefetching properly somewhere. Yet this is getting extremely
>>> confusing - please start fresh showing a "slow but correct" dbic query
>>> (the entire search chain) and the resulting SQL (presumably the one above).
>> Ok. Sorry for not getting back earlier, I've been sidetracked with
>> something else.
>>
>> The problem stems in trying to retrieve columns from the reads2expt
>> table in a SELECT.
>>
>> The below works as expected:
>>
>> my $schema = DB::Schema->connect();
>> my @data;
>> my @rs = $schema->resultset('DbMatch')->search(
>>    {
>>       expt_id => $dataset,
>>       abundance => {'>' => 1},
>>       type => 'hairpin-miRNA'
>>    },
>>    {
>>       join => [
>>          {         # join on seq_reads -> reads2expt tables
>>             read_id => {
>>                reads2expts => 'read_id'
>>             }
>>          },
>>          'seq_id'  # join db_sequences table.
>>       ]
>>    }
>> );
>>
>> foreach my $hit (@rs) {
>>    push @data, [ $hit->get_column('seq_id'), $hit->match_start ];
>> }
>> return(\@data);
>>
>> The SQL for the above is:
>> SELECT me.seq_id, me.read_id, me.search_id, me.match_start,
>> me.match_end, me.query_start, me.mismatch FROM db_match me  JOIN
>> seq_reads read_id ON read_id.read_id = me.read_id LEFT JOIN reads2expt
>> reads2expts ON reads2expts.read_id = read_id.read_id LEFT JOIN seq_reads
>> read_id_2 ON read_id_2.read_id = reads2expts.read_id  JOIN db_sequences
>> seq_id ON seq_id.seq_id = me.seq_id WHERE ( ( abundance > ? AND expt_id
>> = ? AND type = ? ) ): '1', '1', 'hairpin-miRNA'
>>
>>
>> However, if I want to get the value for the 'abundance' column for each
>> row I need to use a search_related() step, which dramatically slows down
>> the process.
>>
>>
>
> Read up on this: http://search.cpan.org/~ribasushi/DBIx-Class-0.08103/lib/DBIx/Class/Manual/Cookbook.pod#Using_joins_and_prefetch

Ah, right. Thanks.

The prefetch is working as far as I can tell, but can't seem to access
the relationship. e.g.

    my $schema = DB::Schema->connect();

    my @data;
    my @rs = $schema->resultset('DbMatch')->search(
       {
          expt_id => $dataset,
          abundance => {'>' => 1},
          type => 'hairpin-miRNA'
       },
       {
          join => [
             {         # join on seq_reads -> reads2expt tables
                read_id => {
                   reads2expts => 'read_id'
                }
             },
             'seq_id'  # join db_sequences table. Need this in order to
filter by 'type'
          ],
          prefetch => [
             {
               read_id => {
                   reads2expts => 'read_id'
                }
             },
          ]
       },
    );

The SQL from this is:
SELECT me.seq_id, me.read_id, me.search_id, me.match_start,
me.match_end, me.query_start, me.mismatch, read_id.read_id,
read_id.length, read_id.seq, reads2expts.expt_id, reads2expts.read_id,
reads2expts.abundance, reads2expts.read_name, read_id_2.read_id,
read_id_2.length, read_id_2.seq FROM db_match me  JOIN seq_reads read_id
ON ( read_id.read_id = me.read_id ) LEFT JOIN reads2expt reads2expts ON
( reads2expts.read_id = read_id.read_id )  JOIN seq_reads read_id_2 ON (
read_id_2.read_id = reads2expts.read_id )  JOIN db_sequences seq_id ON (
seq_id.seq_id = me.seq_id ) WHERE ( abundance > ? AND expt_id = ? AND
type = ? ) ORDER BY reads2expts.read_id: '1', '1', 'hairpin-miRNA'

But I can't get the syntax right for accessing the 'abundance' column.
According to the docs, something like this should work:
foreach my $hit (@rs) {
   push @data, [ $hit->get_column('seq_id'), $hit->match_start,
$hit->read_id->reads2expts->read_id->abundance ];
}

I've tried lots of different variations of the above to access the
'abundance' data, but I always get a 'Can't locate object method...'
error. I really can't see where I going wrong.

Thanks very much for your continued help.
Cheers,

Chris

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...