MySQL Backend - NCI Thesaurus (Truncation, data too long)
Dear all,
I'm trying to upload the NCI Thesaurus (2008 OWL version, 89 MBytes) into Protege using a MySQL Backend. I'm using the API, programmatically (not the GUI). With other ontologies (pizza.owl, for example), everything is fine. For NCI, however, it seems to only partially work. After much processing, I get this error: ....
com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'frame' at row 1

The problem seems to be that the table's definition has automatically been defined to be 255 characters long for attribute 'Frame'. It seems that this is not enough for NCI Thesaurus 2008. For some tuples it needs to be at least 345 characters long.

Can I change the table definition from the API? If I change it manually (MySQL admin) and re-run my program, it redefines the field's size to 255.

I'm using Protege 3.4 Beta (build 504), and MySQL 5.0.51b community edition.

Many thanks for any pointers.

Regards,
Jesus

PS: the program recovers from this error and continues, and I eventually get an out of memory error (even though I'm using 512M heap size), but that could be related to the previous error... so let's take one at a time.

--
Jesus Bisbal-Riera
http://www.tecn.upf.es/~jbisbal
Currently: Visiting Researcher at Dublin Institute of Technology, Kevin Street, Dublin 8, Ireland
Ph: +35314024929
Department of Information and Communication Technologies, Universitat Pompeu Fabra
http://www.upf.edu
Passeig de Circumval·lació, 8, 08003 Barcelona, Spain
Work Ph: +34 93 542 29 51 / 25 00
Fax: +34 93 542 25 17

_______________________________________________
protege-owl mailing list
protege-owl@...
https://mailman.stanford.edu/mailman/listinfo/protege-owl
Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03
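One way to find out how wide the 'frame' column has to be before starting a long load is to measure the longest frame name up front. The following is only a minimal sketch, assuming the frame names (URIs) have already been collected into a list by some other means. The class `FrameLengthCheck` and its methods are hypothetical helpers and are not part of the Protege API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Protege: checks frame names against a column width.
final class FrameLengthCheck {
    // Column width reported in the post above.
    static final int DEFAULT_WIDTH = 255;

    // Length in Unicode code points, which is how MySQL counts VARCHAR(n) characters.
    static int charLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Smallest VARCHAR width that would hold every name in the list.
    static int requiredWidth(List<String> names) {
        int max = 0;
        for (String name : names) {
            max = Math.max(max, charLength(name));
        }
        return max;
    }

    // Names that would be truncated at the given column width.
    static List<String> tooLong(List<String> names, int limit) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (charLength(name) > limit) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: one short name and one 345-character name.
        StringBuilder longName = new StringBuilder("http://example.org/onto#");
        while (longName.length() < 345) {
            longName.append('x');
        }
        List<String> names = new ArrayList<>();
        names.add("http://example.org/onto#Short");
        names.add(longName.toString());

        System.out.println("Required width: " + requiredWidth(names));
        System.out.println("Names over " + DEFAULT_WIDTH + ": "
                + tooLong(names, DEFAULT_WIDTH).size());
    }
}
```

If the reported width exceeds 255, the column definition needs to be widened before the load is attempted.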
Re: MySQL Backend - NCI Thesaurus (Truncation, data too long)

Hi Jesus,
Indeed, the new thesaurus has a URI that is over 300 characters long. You can easily change the column definition used by Protege. Just edit protege.properties and add this line:

    Database.typename.varchar.com.mysql.jdbc.Driver=VARCHAR(500) COLLATE UTF8_BIN

You can use a larger number than 500 if needed.

If you are using the non-streaming parser, you need to allocate a heap size of around 800M to Protege, because it first reads the ontology into memory and then writes it out to the database. With the streaming parser the memory requirements are smaller, but I did not measure exactly how much you need.

We have a wiki page that describes how to programmatically convert an OWL file to a database (both the streaming and the non-streaming approach):
http://protegewiki.stanford.edu/index.php/ConvertingToDatabaseProject

Tania

Jesus Bisbal Riera wrote:
> Dear all,
> I'm trying to upload the NCI Thesaurus (2008 OWL version, 89
> MBytes) into Protege using a MySQL Backend. I'm using the API,
> programmatically (not the GUI). With other ontologies (pizza.owl, for
> example), everything is fine.
> For NCI, however, it seems to only partially work. After much
> processing, I get this error:
>
> ....
> Loaded 1110000 triples 9280
>
> com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long
> for column 'frame' at row 1
>
> The database has over a million tuples into it.
> The problem seems to be that the table's definition has
> automatically been defined to be 255 characters long for attribute
> 'Frame'. It seems that this is not enough for NCI Thesaurus 2008. For
> some tuples it needs to be at least 345 characters long.
> Can I change the table definition from the API? If I change it
> manually (MySQL admin) and re-run my program, it redefines the field's
> size to 255.
> I'm using Protege 3.4 Beta (build 504), and MySQL 5.0.51b
> community edition.
> Many thanks for any pointers.
> Regards,
>
> Jesus
>
> PS: the program recovers from this error and continues, and I
> eventually get an out of memory error (even though I'm using 512M heap
> size), but that could be related to the previous error... so let's
> take one at a time.
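Since the reply suggests roughly 800M of heap for the non-streaming parser, a conversion program could check its own heap limit before starting the load. This is only a sketch using the standard `Runtime.maxMemory()` call. The 800 MB figure is taken from the reply above, and the class `HeapCheck` and its method are hypothetical names, not part of Protege.

```java
// Hypothetical pre-flight check, not part of Protege.
final class HeapCheck {
    // Heap size suggested in the reply for the non-streaming parser (about 800M).
    static final long SUGGESTED_HEAP_BYTES = 800L * 1024 * 1024;

    // True if the available maximum heap is at least the required amount.
    static boolean isEnough(long maxHeapBytes, long requiredBytes) {
        return maxHeapBytes >= requiredBytes;
    }

    public static void main(String[] args) {
        // maxMemory() can report somewhat less than the -Xmx value,
        // so treat a near miss as a warning rather than a hard failure.
        long maxHeap = Runtime.getRuntime().maxMemory();
        long maxHeapMb = maxHeap / (1024 * 1024);
        if (isEnough(maxHeap, SUGGESTED_HEAP_BYTES)) {
            System.out.println("Heap limit looks sufficient: " + maxHeapMb + " MB");
        } else {
            System.out.println("Heap limit may be too small: " + maxHeapMb
                    + " MB; consider a larger -Xmx setting");
        }
    }
}
```

The heap limit itself is set when the JVM is started, for example with the standard `-Xmx800M` option.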