The fill_cvtermpath stored procedure

I just noticed that the fill_cvtermpath stored procedure doesn't properly take account of relationship types: it thinks that a centromere is_a chromosome, for example.

Attached is a proposed replacement procedure that behaves as I believe it should (according to the docs on the wiki). I haven't put any effort into making it fast, on the grounds that it's not likely to be used frequently, though no doubt it could be made faster if necessary.

The result of this is more useful, certainly for what I'm using it for now. (I just need the reflexive transitive closure of the is_a relation.) But it's still rather unsatisfactory in some ways. In particular, it assumes that all relations are reflexive and transitive, which isn't true. For example, 'non_functional_homolog_of' is not reflexive, and 'adjacent_to' is neither reflexive nor transitive. The cvtermpath entries for relations of this sort are at least partially nonsensical.

At least for the relationship ontology types, we do know which of them are reflexive and which transitive, because it's in the OBO file. (We also know that, for example, has_part is the inverse of part_of.) Is there any reason not to take this information into account when populating cvtermpath?

The other issue is that of relations between the relation types themselves. If A `proper_part_of` B, then certainly A `part_of` B, for example. It would be useful if cvtermpath could also include derived relations of this sort.

If this has been discussed ad nauseam in the past, I apologise for reopening old wounds. :-)

Robin

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

PS. I'm aware that there's also a Perl script make_cvtermpath.pl included in the Chado distribution. I haven't examined it in detail.
From a cursory glance, it seems it wouldn't work as intended, since it assumes that the name of the relationship ontology CV is 'Relationship Ontology' and it expects there to be a cvterm whose name is 'OBO_REL:0001'.

_______________________________________________
Gmod-schema mailing list
Gmod-schema@...
https://lists.sourceforge.net/lists/listinfo/gmod-schema
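For readers following the thread, the behaviour Robin asks for (a reflexive transitive closure computed separately for each relationship type) can be sketched in a few lines of Python. This is not the attached procedure and not Chado code: the function name `closure_for_type`, the toy `EDGES` list, and the term names in it are all made up for illustration, and the sketch records only the shortest distance between two terms.

```python
from collections import defaultdict, deque

# Toy (subject, relationship type, object) triples; purely illustrative.
EDGES = [
    ("centromere", "is_a", "chromosomal_structural_element"),
    ("chromosomal_structural_element", "is_a", "region"),
    ("centromere", "part_of", "chromosome"),
]


def closure_for_type(edges, rel_type):
    """Return {(subject, object): distance} for the reflexive transitive
    closure of one relationship type, following only edges of that type."""
    graph = defaultdict(set)
    terms = set()
    for subject, rel, obj in edges:
        terms.update((subject, obj))
        if rel == rel_type:
            graph[subject].add(obj)

    paths = {}
    for start in terms:
        paths[(start, start)] = 0  # reflexive entry, distance 0
        seen = {start}
        queue = deque([(start, 0)])
        while queue:  # breadth-first search gives shortest distances
            node, dist = queue.popleft()
            for parent in graph[node]:
                if parent not in seen:
                    seen.add(parent)
                    paths[(start, parent)] = dist + 1
                    queue.append((parent, dist + 1))
    return paths


is_a = closure_for_type(EDGES, "is_a")
part_of = closure_for_type(EDGES, "part_of")
```

With the toy data, `is_a` contains `("centromere", "region")` at distance 2 but no entry for `("centromere", "chromosome")`; that pair appears only in `part_of`. Note that the sketch still treats whichever type it is given as reflexive and transitive, which is exactly the assumption questioned in the message above.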
Re: The fill_cvtermpath stored procedure

Hi Robin

In fact this has not been discussed much. I'm not sure what procedures are currently used to fill the cvtermpath table. I suspect there are many groups who are not populating this at all, and thus not taking advantage of the ontology graphs in their annotations.

Thanks for the patch. I'm not sure if the procedure will continue to be used much, because (a) people seem to have a gut reaction against procedural code in the DBMS, (b) there are problems with large closures, since the procedure has to run inside a transaction, and (c) as you say, it has to make certain assumptions about relations that may not be true.

I would recommend instead moving towards using an external tool with the necessary logic to build the full transitive closure. The documentation on how to do this with the GO database is applicable to Chado:

http://wiki.geneontology.org/index.php/Transitive_closure

You can use oboedit2 in command line mode to generate a closure table that can easily be slurped into Chado (OK, we are lacking the script but it should be simple). Or if you are using an ontology registered with OBO, there are closure tables already built for you. See the above wiki for details.

The advantage of using this approach is you don't have to reimplement the reasoning logic. It should do the right thing as far as relations are concerned (taking into account sub-relations, relation composition, etc.).

On Sep 9, 2008, at 2:49 PM, Robin Houston wrote:

> I just noticed that the fill_cvtermpath stored procedure doesn't
> properly take account of relationship types: it thinks that a
> centromere is_a chromosome, for example.
> [...]
> <fill_cvtermpath.pgplsql>
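To make concrete what "taking into account sub-relations" and per-relation properties involves, here is a toy Python sketch. It is not how oboedit2 or any reasoner works, and it ignores relation composition and inverses. The `Relation` class, the `derive` function, and the flags in `RELATIONS` are assumptions written by hand for illustration; in practice the flags would come from the OBO file, as discussed earlier in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relation:
    name: str
    transitive: bool = False
    reflexive: bool = False
    parent: Optional[str] = None  # super-relation, if any


# Hand-written flags, assumed for this example only.
RELATIONS = {
    "part_of": Relation("part_of", transitive=True, reflexive=True),
    "proper_part_of": Relation("proper_part_of", transitive=True,
                               parent="part_of"),
    "adjacent_to": Relation("adjacent_to"),
}

EDGES = [
    ("A", "proper_part_of", "B"),
    ("B", "part_of", "C"),
    ("C", "adjacent_to", "D"),
    ("D", "adjacent_to", "E"),
]


def derive(edges, relations):
    """Return every (subject, relation, object) triple implied by edges."""
    facts = set(edges)
    terms = {term for s, _, o in edges for term in (s, o)}
    # Reflexive relations hold between every term and itself.
    for rel in relations.values():
        if rel.reflexive:
            facts.update((term, rel.name, term) for term in terms)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for s, r, o in list(facts):
            # Sub-relation rule: A proper_part_of B implies A part_of B.
            parent = relations[r].parent
            if parent and (s, parent, o) not in facts:
                facts.add((s, parent, o))
                changed = True
            # Transitivity, applied only to relations flagged transitive.
            if relations[r].transitive:
                for s2, r2, o2 in list(facts):
                    if r2 == r and s2 == o and (s, r, o2) not in facts:
                        facts.add((s, r, o2))
                        changed = True
    return facts


FACTS = derive(EDGES, RELATIONS)
```

On the toy data this derives A part_of B and A part_of C, but neither A proper_part_of C nor C adjacent_to E, and it adds reflexive entries only for part_of. Even this small example needs three interacting rules, which is the argument above for leaving the reasoning to an external tool.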