Re: [Cdk-commits] SF.net SVN: cdk: [11547] cdk/branches/drzz-jumbo5.4

View: New views
5 Messages — Rating Filter:   Alert me  

Parent Message unknown Re: [Cdk-commits] SF.net SVN: cdk: [11547] cdk/branches/drzz-jumbo5.4

by Egon Willighagen-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Dan,

On Fri, Jul 11, 2008 at 4:42 PM,  <drzz@...> wrote:>
@@ -133,7 +133,7 @@
>                        CMLAtomArray ar=new CMLAtomArray();
>                        for (int i=0; i<chargeGroup.getAtomCount();i++){
>                                CMLAtom cmlAtom=new CMLAtom();
> -                               cmlAtom.setRef(chargeGroup.getAtom(i).getID());
> +                               cmlAtom.setId(chargeGroup.getAtom(i).getID());
>
>                                //Append switching atom?
>                                if (chargeGroup.getAtom(i).equals(chargeGroup.getSwitchingAtom())) {

This change should have better been submitted as separate change, as
it actually changes the way to code behaves... I assume it fixes an
incorrect use of CML, which is fine of course, but does deserve at
least a commit message, but really a unit test too, to ensure that the
same mistake is not made again...

Interestingly, the use of @ref might be correct in certain CML uses,
where later atoms are actually refering to earlier definitions... if
not mistaken, this might be used for certain reactions...

Maybe the best way to resolve this 'issue' is to see who commited that
setRef() line, and ask why setRef() was used there...

You can use 'svn annotate' to see who last changed that line...

Egon

--
----
http://chem-bla-ics.blogspot.com/

-------------------------------------------------------------------------
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
Cdk-devel mailing list
Cdk-devel@...
https://lists.sourceforge.net/lists/listinfo/cdk-devel

Re: [Cdk-commits] SF.net SVN: cdk: [11547] cdk/branches/drzz-jumbo5.4

by Daniel Zaharevitz :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Jul 12, 2008, at 4:15 AM, Egon Willighagen wrote:

Hi Dan,

On Fri, Jul 11, 2008 at 4:42 PM,  <drzz@...> wrote:>
@@ -133,7 +133,7 @@
                       CMLAtomArray ar=new CMLAtomArray();
                       for (int i=0; i<chargeGroup.getAtomCount();i++){
                               CMLAtom cmlAtom=new CMLAtom();
-                               cmlAtom.setRef(chargeGroup.getAtom(i).getID());
+                               cmlAtom.setId(chargeGroup.getAtom(i).getID());

                               //Append switching atom?
                               if (chargeGroup.getAtom(i).equals(chargeGroup.getSwitchingAtom())) {

This change should have better been submitted as separate change, as
it actually changes the way to code behaves... I assume it fixes an
incorrect use of CML, which is fine of course, but does deserve at
least a commit message, but really a unit test too, to ensure that the
same mistake is not made again...

OK, maybe I'll just add some comments and commit those changes. I'm not sure what unit test would add anything. The reason the changes were made is that 5.4 throws an exception when trying to add an atom with no id. That was already caught by the present unit test. I'll have to think about other things that may need new unit tests to pick up.

Interestingly, the use of @ref might be correct in certain CML uses,
where later atoms are actually refering to earlier definitions... if
not mistaken, this might be used for certain reactions...

Yes, this gets into what I mentioned in the other reply about CML issues to work on. The code is for interfacing to a molecular dynamics program (unnamed, see http://wiki.cubic.uni-koeln.de/cdkwiki/doku.php?id=mdmolecule). The current code is certainly internally consistent (that's exactly what the unit test checks), but is it consistent with broader CML usage? The main goals are to code residue and chargeGroup membership. It uses submolecules for this:

<molecule>
  <atomArray>
    <atom id="a1">
     ...
   </atomArray>
   
   <molecule distRef="md:residue"  id="r1" titel="Rname">
     <atomArray>
       <atom ref="a1">
    </atomArray>
   </molecule>

</molecule>

I seem to recall that there were more direct ways to code this in CML, but I need to check with the Cambridge people. There's also the question of exactly what the difference in usage between ref and id and how that applies to the present situation. Is an atom in the submolecule a new distinct atom which uses a ref to connect to the "real" atom it refers to or is it a "copy" of the "real atom" which can be indicated by using the same id? I don't think these questions are really Jumbo 5.4 specific and I would really like to get Jumbo 5.4 in and then start working on these issues rather than try to take care of it all as part of the 5.4 transition. As I type this, it occurs to me the minimal change which should keep compatibility with any use of this code would be to write both an id and a ref, so in the fragment above in the residue molecule:

  <atom ref="a1" id ="r1_a1">

I think this should get us to a workable state.




Maybe the best way to resolve this 'issue' is to see who commited that
setRef() line, and ask why setRef() was used there...

You can use 'svn annotate' to see who last changed that line...


It looks like the previous code was added by you in Revision 7971 in Feb 2007.  Do you know what the MD program is and is anyone still using it?

DanZ


/********************************************
   Daniel Zaharevitz
   Chief, Information Technology Branch
   National Cancer Institute
********************************************/




-------------------------------------------------------------------------
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
Cdk-devel mailing list
Cdk-devel@...
https://lists.sourceforge.net/lists/listinfo/cdk-devel

Re: [Cdk-commits] SF.net SVN: cdk: [11547] cdk/branches/drzz-jumbo5.4

by Egon Willighagen-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, Jul 14, 2008 at 1:42 PM, Daniel Zaharevitz
<zaharevitz@...> wrote:
> It looks like the previous code was added by you in Revision 7971 in Feb
> 2007.  Do you know what the MD program is and is anyone still using it?
> DanZ

Yes, by the Bioclipse people...

OK, then I understand the use. @ref is used because it refers to the
same atoms, just with different coordinates... this templating is
something Peter suggested previously for these kind of multi
conformation things...

Let's take this to cml-discuss...

Egon

--
----
http://chem-bla-ics.blogspot.com/

-------------------------------------------------------------------------
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
Cdk-devel mailing list
Cdk-devel@...
https://lists.sourceforge.net/lists/listinfo/cdk-devel

Re: [Cdk-commits] SF.net SVN: cdk: [11547] cdk/branches/drzz-jumbo5.4

by Daniel Zaharevitz-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Jul 14, 2008, at 7:42 AM, Daniel Zaharevitz wrote:

 As I type this, it occurs to me the minimal change which should keep compatibility with any use of this code would be to write both an id and a ref, so in the fragment above in the residue molecule:

  <atom ref="a1" id ="r1_a1">

I think this should get us to a workable state.



That change was made and committed to the branch. In further testing I found another problem, but it is unrelated to the Jumbo 5.4 transition. Any CML read that finds an atom without an explicit hydrogen count sets the count to zero. In CMLCoreModule

   if (atomCounter > hCounts.size()) {
       /* while strictly undefined, assume zero 
       implicit hydrogens when no number is given */
       hCounts.add("0");
   }

This has been in here for years. I'm not sure what changed in the atomType code in the last few months, but earlier in the year this didn't cause the atom type matching to fail and now it does. I think the failure is correct, because it does check to see if HydrogenCount != CDLConstants.UNSET. Obviously the read code should set HydrogenCount to UNSET rather than 0. I'm a little unsure of myself in that code, so it might take a while for me do fix that. I'm pretty sure this has nothing to do with Jumbo 5.4, but I'll know more tomorrow when I finish rerunning all the structure load stuff.

DanZ


/********************************************
   Daniel Zaharevitz
   Chief, Information Technology Branch
   National Cancer Institute
********************************************/




-------------------------------------------------------------------------
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
Cdk-devel mailing list
Cdk-devel@...
https://lists.sourceforge.net/lists/listinfo/cdk-devel

Re: [Cdk-commits] SF.net SVN: cdk: [11547] cdk/branches/drzz-jumbo5.4

by peter murray-rust :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Daniel Zaharevitz wrote:

This is a difficult legacy which we are now addressing. There is no
simple way of registering that an int is undefined and there is little
communal understanding of what is meant by "undefined" in chemistry. In
our latest schema we have decided that all hydrogen atoms must be
explicit all the time. So it is the responsibiluty of the agent who
ingests a molecule to make sure that it is clear what the hydrogen count
is on every atom.  It is much cheaper to include the H atoms (which may
or may not have coordinates) than to worry about trying to sort it out
afterwards.

P.

>
> On Jul 14, 2008, at 7:42 AM, Daniel Zaharevitz wrote:
>
>>  As I type this, it occurs to me the minimal change which should keep
>> compatibility with any use of this code would be to write both an id
>> and a ref, so in the fragment above in the residue molecule:
>>
>>   <atom ref="a1" id ="r1_a1">
>>
>> I think this should get us to a workable state.
>>
>>
>
> That change was made and committed to the branch. In further testing I
> found another problem, but it is unrelated to the Jumbo 5.4
> transition. Any CML read that finds an atom without an explicit
> hydrogen count sets the count to zero. In CMLCoreModule
>
>    *if* (atomCounter > hCounts.size()) {
>        /* while strictly undefined, assume zero
>        implicit hydrogens when no number is given */
>        hCounts.add("0");
>    }
>
> This has been in here for years. I'm not sure what changed in the
> atomType code in the last few months, but earlier in the year this
> didn't cause the atom type matching to fail and now it does. I think
> the failure is correct, because it does check to see if HydrogenCount
> != CDLConstants.UNSET. Obviously the read code should set
> HydrogenCount to UNSET rather than 0. I'm a little unsure of myself in
> that code, so it might take a while for me do fix that. I'm pretty
> sure this has nothing to do with Jumbo 5.4, but I'll know more
> tomorrow when I finish rerunning all the structure load stuff.
>
> DanZ
>
>
> /********************************************
>    Daniel Zaharevitz
>    Chief, Information Technology Branch
>    National Cancer Institute
>    zaharevitz@... <mailto:zaharevitz@...>
> ********************************************/
>
>
>
> ------------------------------------------------------------------------
>
> -------------------------------------------------------------------------
> Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
> Studies have shown that voting for your favorite open source project,
> along with a healthy diet, reduces your potential for chronic lameness
> and boredom. Vote Now at http://www.sourceforge.net/community/cca08
> ------------------------------------------------------------------------
>
> _______________________________________________
> Cdk-devel mailing list
> Cdk-devel@...
> https://lists.sourceforge.net/lists/listinfo/cdk-devel
>  


-------------------------------------------------------------------------
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
Cdk-devel mailing list
Cdk-devel@...
https://lists.sourceforge.net/lists/listinfo/cdk-devel
LightInTheBox - Buy quality products at wholesale price