|
View:
New views
4 Messages
—
Rating Filter:
Alert me
|
|
|
Literals, Introducers, Character Set NONE, suggestionsHello everybody!
Sorry for long post, question at bottom. I have a question regarding non-ascii characters, client connection sets, and introducers. My application uses Firebird Embedded, connecting to the database using CHARACTER SET NONE. I create a test database, with default character set encoding of ISO8859_1. I'm testing Firebird 2.5. When my client connects to the database, it is possible for the user to enter SQL syntax like this, with non-ASCII characters. create table "ICU" ("BIN" int, "LINGSUB" int, "BINSUB" int, "NAME" character(20)); insert into "ICU" values (2, 4, 5, 'Adélaïde'); The problem here is that since the literal should really be specified with an introducer, _UNICODE_FSS or _ISO8859_1. When I fetch this record, I find the data is corrupted. We originally fixed this issue in SAS Vulcan by having our driver rewrite the query and add "_UNICODE_FSS" to all literals. This was kind of yuck, since if the user had an error in the SQL, the error message would come back with the _UNICODE_FSS token in the parsing error. Then, I fixed the issue in SAS Vulcan directly in dsql/pass1, routine pass1_constant(). I added code that forced _UNICODE_FSS on all literals, unless the user supplied an introducer. // force UNICODE_FSS, unless user has supplied an introducer. if (!string->str_charset) string->str_charset = "UNICODE_FSS"; This worked great for me, but doesn't seem to be a fix that can go back to HEAD. Too big of a hammer. ;-) HEAD version of pass1_constant() now appears to have code that determines the client connection set, and sets the text type of the literal correctly. Would it be desirable to add a check in the Firebird engine to see if the client connection set is NONE, and then treat the literal as UNICODE_FSS? Perhaps there is a better way to make this check, and there may be other solutions I haven't considered here, too. My fallback is to put code back in my driver to add the introducers, but if making a change in Firebird itself is helpful for others, that would help me too. Thanks in advance! -bill ------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel |
|
|
Re: Literals, Introducers, Character Set NONE, suggestionsBill Oliver escreveu:
> Hello everybody! > > Sorry for long post, question at bottom. > > I have a question regarding non-ascii characters, client connection sets, > and introducers. > > My application uses Firebird Embedded, connecting to the database using > CHARACTER SET NONE. I create a test database, with default character set > encoding of ISO8859_1. > I'm testing Firebird 2.5. > > When my client connects to the database, it is possible for the user to > enter SQL syntax like this, with non-ASCII characters. > > create table "ICU" ("BIN" int, "LINGSUB" int, "BINSUB" int, "NAME" > character(20)); > insert into "ICU" values (2, 4, 5, 'Adélaïde'); > And that 'Adélaïde' is encoded in UNICODE_FSS. > The problem here is that since the literal should really be specified with > an introducer, _UNICODE_FSS or _ISO8859_1. When I fetch this record, I find > the data is corrupted. > > We originally fixed this issue in SAS Vulcan by having our driver rewrite > the query and add "_UNICODE_FSS" to all literals. This was kind of yuck, > since if the user had an error in the SQL, the error message would come back > with the _UNICODE_FSS token in the parsing error. > > Then, I fixed the issue in SAS Vulcan directly in dsql/pass1, routine > pass1_constant(). I added code that forced _UNICODE_FSS on all literals, > unless the user supplied an introducer. > > // force UNICODE_FSS, unless user has supplied an introducer. > if (!string->str_charset) > string->str_charset = "UNICODE_FSS"; > > This worked great for me, but doesn't seem to be a fix that can go back to > HEAD. Too big of a hammer. ;-) > > HEAD version of pass1_constant() now appears to have code that determines > the client connection set, and sets the text type of the literal correctly. > > Would it be desirable to add a check in the Firebird engine to see if the > client connection set is NONE, and then treat the literal as UNICODE_FSS? > > Perhaps there is a better way to make this check, and there may be other > solutions I haven't considered here, too. Yes. Why not you use UNICODE_FSS as connection character set? Adriano ------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel |
|
|
Re: Literals, Introducers, Character Set NONE,> Yes. Why not you use UNICODE_FSS as connection character set?
Thanks for the reply! The decision to connect with character set NONE, was done before my time. I assume that it was done for performance reasons, or perhaps because of the many problems with UNICODE_FSS present in pre-2.0 versions of Firebird. I'm willing to try this. Any limitations I should know about? Also, I thought that multi-byte character sets were prohibited as client connection sets, but this seems not to be true? Perhaps this is because with UNICODE_FSS, table names, etc. are still in 1-byte ascii representation? -bill ------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel |
|
|
Re: Literals, Introducers, Character Set NONE,Bill Oliver escreveu:
>> Yes. Why not you use UNICODE_FSS as connection character set? >> > > Thanks for the reply! > > The decision to connect with character set NONE, was done before my time. I > assume that it was done for performance reasons, or perhaps because of the > many problems with UNICODE_FSS present in pre-2.0 versions of Firebird. > The major problem before 2.0 is that buffer sizes aren't translated from one charset to another. If you have a CHAR(10) ISO8859-1 field and connect using UNICODE_FSS, the buffer in described has 10 bytes (instead of the necessary 30). > I'm willing to try this. Any limitations I should know about? AFAIK, no. > Also, I thought that multi-byte character sets were prohibited as client connection > sets, but this seems not to be true? It's not true. > Perhaps this is because with UNICODE_FSS, table names, etc. are still in 1-byte ascii representation? The only limitation is that you can't create object names using more than 31 bytes (instead of characters). Adriano ------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel |
| Free Forum Powered by Nabble | Forum Help |