Re: Feature Lists

Hi Ben,

Did somebody answer your question? Chado doesn't have a single convention for grouping sets of features. Different kinds of feature are grouped differently. Do you have some examples of what you want to do that you'd be willing to share, and we could work from those?

-Dave

On Sat, 28 Jun 2008 5:11, Ben Woodcroft wrote:
> Hi,
>
> Is there any table or accepted best practise in Chado for storing sets
> of features?
>
> Thanks,
> ben
>
> -------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://sourceforge.net/services/buy/index.php
> _______________________________________________
> Gmod-schema mailing list
> Gmod-schema@...
> https://lists.sourceforge.net/lists/listinfo/gmod-schema
Re: Feature Lists

Hi Dave,

I think it's worth noting that the method described by Jay (i.e., linking the desired set of chado features to a new feature that represents the group, via feature_relationships of type "member_of") has been used both at JCVI(1) (formerly TIGR) and, from what I remember, at a couple of other sites too, to represent protein/polypeptide clusters (e.g., for putative groups of orthologous polypeptides).

The reason for grouping features may be different in the current instance, but I don't immediately see a reason why the same implementation couldn't be used both for protein clusters and for more generally-defined groups of features. If there's agreement on that point then I, for one, would argue for using the implementation that has already been tested in a couple of places for the protein cluster case.

I also don't think the cvterm-based approach you described would work as well for the protein cluster case, since there is no analysiscvterm table, which you'd need in order to represent groups generated by an algorithm.

Jonathan

(1) Actually the JCVI/TIGR implementation used featureloc instead of feature_relationship, but in retrospect this turned out to be unnecessary, as no coordinate information was stored, and there's a natural fit IMO in using "member_of" feature_relationships to represent group membership. The minus, of course, is that you are expanding the interpretation of "feature" to include groups of features.

On Wed, Jul 2, 2008 at 9:11 AM, David Emmert <emmert@...> wrote:
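For readers following the thread, the "member_of" approach Jonathan describes above can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the JCVI/TIGR implementation: it uses Python's sqlite3 with cut-down stand-ins for the Chado feature, cvterm and feature_relationship tables (real Chado tables have more columns), and the helper names and the group type names "protein_cluster" and "feature_group" are hypothetical.

```python
import sqlite3

# Cut-down stand-ins for three Chado tables. Real Chado tables carry more
# columns (e.g. organism_id, uniquename, cv_id, rank); they are omitted here.
SCHEMA = """
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
CREATE TABLE feature_relationship (
    feature_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES feature(feature_id),
    object_id  INTEGER NOT NULL REFERENCES feature(feature_id),
    type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
    UNIQUE (subject_id, object_id, type_id)
);
"""


def cvterm_id(conn, name):
    """Return the id of the cvterm with this name, creating it if needed."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def add_feature(conn, name, type_name):
    """Insert a feature of the given type and return its feature_id."""
    cur = conn.execute(
        "INSERT INTO feature (name, type_id) VALUES (?, ?)",
        (name, cvterm_id(conn, type_name)),
    )
    return cur.lastrowid


def add_group(conn, group_name, group_type, member_ids):
    """Create a feature that stands for the group and link each member to it
    with a feature_relationship of type 'member_of' (member is the subject,
    group is the object)."""
    group_id = add_feature(conn, group_name, group_type)
    member_of = cvterm_id(conn, "member_of")
    conn.executemany(
        "INSERT INTO feature_relationship (subject_id, object_id, type_id) "
        "VALUES (?, ?, ?)",
        [(m, group_id, member_of) for m in member_ids],
    )
    return group_id


def group_members(conn, group_id):
    """Return the names of all features that are 'member_of' the group."""
    rows = conn.execute(
        "SELECT f.name FROM feature_relationship fr "
        "JOIN feature f ON f.feature_id = fr.subject_id "
        "JOIN cvterm t ON t.cvterm_id = fr.type_id "
        "WHERE fr.object_id = ? AND t.name = 'member_of' "
        "ORDER BY f.name",
        (group_id,),
    )
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    genes = [add_feature(conn, n, "gene") for n in ("geneA", "geneB")]
    # Hypothetical type name for a user-defined group.
    group = add_group(conn, "my_gene_list", "feature_group", genes)
    print(group_members(conn, group))
```

The same add_group call would serve for a protein cluster by passing polypeptide features and a different group type, which is the "single answer" being argued for in this thread. Whether a group should be stored as a feature at all is exactly the point under discussion, so this shows the mechanics only.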
Re: Feature Lists

Hi Dave,

I agree with you completely that some discipline is needed to prevent the schema from becoming a complete free-for-all. Where I think we disagree is on the question of what types of usage would violate (or have violated) the original definition of "feature." Your position, if I understand it correctly, is that:

1. Representing algorithmically-derived protein clusters as features is a "departure" from the original definition of feature but is acceptable "because at least they're a sort of abstraction of valid features."
2. Representing arbitrary groups of features as a feature would "utterly break" the definition of "feature", because such groups of features need not have any biological relevance.

So it sounds as though your principal objection (to using the feature table to represent arbitrary feature groups) is the (potential) lack of biological relevance in the proposed user-defined groups of features. My (current) view, on the other hand, is that:

1. Representing algorithmically-derived protein clusters as features breaks the original definition of feature.
2. Representing arbitrary groups of features as features breaks the original definition of feature.
3. Given that the cat (1.) is already out of the bag, so to speak, why not also allow 2.?

In other words, I see the exception currently granted for protein clusters as implicitly broadening the definition of feature from "biological sequence or something localizable to a biological sequence" to "biological sequence or something localizable to a biological sequence OR a group of such sequences and/or features". I don't think it's a necessary condition for the group itself to have biological relevance. (I have certainly seen automatically-generated protein clusters in the past whose biological relevance was questionable at best!)
Or, to put it another way, if a scientist comes along and says "Hey, I think this group of genes is worth looking at right now because of X, Y, and Z" then shouldn't that in and of itself constitute sufficient biological relevance? (I believe this is the use-case under consideration, with the caveat that the scientist won't necessarily tell you the "X, Y, and Z" part.) If that's not a sufficiently compelling argument, then consider the following questions:

1. How do you propose to define "a sort of abstraction of valid feature" in a sufficiently rigorous manner to allow future chado users to tell which departures from the definition of "feature" are acceptable and which utterly break the definition?
2. If biological relevance is the crucial criterion that determines whether a group of features is allowed to be represented as a feature via feature_relationship, then doesn't this imply that _some_ user-defined feature groups should be represented as features and some should be represented via the cvterm mechanism you described? What if we required the end-user to check a box in the user interface that said "I certify that this group of genes has biological relevance."? Would you then allow it to be stored as a feature, but make it a cvterm if the user failed to check the box?

I'm taking things to the logical extreme here, of course, but I'm trying to make the point that, when applied to groups of biological sequences or features, the term "biological relevance" is something that different people are liable to disagree over. To me at least, the crucial question here is simply "How should we represent groups of features in chado?" and given that a protein cluster is a special case of a group of features, I would prefer that there be a single answer to this question.

What do you think? Am I misconstruing what you mean by biological relevance somehow?
Jonathan

On Wed, Jul 2, 2008 at 2:19 PM, David Emmert <emmert@...> wrote:
> Hi Jonathan,