SiComponents Home Page SiComponents Forums
Here you will be able to get help and share your experience
New Version Need with Unicode storage in TsiLang
DInfo



Joined: 02 Mar 2005
Posts: 24

Posted: Tue Oct 24, 2006 1:54 am

Igor,

Thanks so much for your comments.

After we changed to pure Unicode storage, making no changes to the ElPack, TNT, or standard Delphi controls, the strings display correctly with no garbage characters. So (please forgive me for a bug report) this is an issue with either TsiLang or Borland, as described below.

We did have to make one change to TypInfo to eliminate a rather senseless conversion from Unicode to Ansi and back to Unicode. Was this the source of our original problem? Your code calls this function in TypInfo, so maybe it is. Whichever code caused it, a faulty Ansi-to-Unicode conversion was at the root of the issue. In essence, Borland was goofy here: the SetStrProp(Instance, FindPropInfo(Instance, PropName), Value); call implicitly converts Value to an AnsiString. I can see no reason in the code for doing this, so I'm a bit baffled. Our fix follows:


procedure SetWideStrProp(Instance: TObject; const PropName: string; const Value: WideString);
var
  PropInfo: PPropInfo;
begin
  // Original call:
  // it implicitly converts Value into an AnsiString, which loses WideString chars
  //SetStrProp(Instance, FindPropInfo(Instance, PropName), Value);

  // Instead, call the WideString routine directly to preserve WideString integrity
  PropInfo := FindPropInfo(Instance, PropName);
  if PropInfo^.PropType^.Kind = tkWString then
    SetWideStrProp(Instance, PropInfo, Value)
  else
    SetStrProp(Instance, PropInfo, Value);
end;

Is it possible that this Borland code causes an Ansi to Wide to Ansi to Wide conversion in TsiLang, and that this creates the code page issues?

Would you be interested in incorporating our Unicode-enabled storage into TsiLang?

I hope this helps.

Best,

David
isiticov
Site Admin


Joined: 21 Nov 2002
Posts: 2097

Posted: Tue Oct 24, 2006 12:42 pm

Hi David,

Thank you very much for your information. Could you please let me know the following: did you use SetWideStrProp() internally in your code, and is this why there were problems with Unicode? I ask because TsiLang uses only the SetWideStrProp(Instance: TObject; PropInfo: PPropInfo; const Value: WideString); implementation.
Also, what version of Delphi do you use? I wasn't able to find your sample code in the Delphi 6+ sources.
Yes, I would like to take a look at your Unicode storage, but I can't promise to integrate it into TsiLang, because the new version is almost ready and finalized.
DInfo



Joined: 02 Mar 2005
Posts: 24

Posted: Fri Oct 27, 2006 8:03 pm

Igor,

We are calling SetWideStrProp just like TsiLang calls it. In fact, this is how we found the Delphi 6 bug I mentioned. We are using Delphi 6. The function that I pasted is the function after we fixed it. Look for SetWideStrProp in TypInfo.pas. Trace it and you will see the implicit WideString-to-AnsiString conversion when it calls SetStrProp. SetStrProp *only* takes an AnsiString, so the Delphi compiler implicitly converts the WideString to an AnsiString. Inside SetStrProp, Delphi tests the property's type kind and passes tkWString properties on to a function that handles WideStrings. However, the WideString data was already lost when the string came into SetStrProp, so sending it to a WideString function does not restore the lost characters.

Since TsiLang is calling SetWideStrProp, this bug affects you too.

I am on a super deadline now, but I would be glad to pass on our Unicode handling code for your next version.

Best,

David
isiticov
Site Admin


Joined: 21 Nov 2002
Posts: 2097

Posted: Sat Oct 28, 2006 4:45 am

David,

Thank you for the details, but now I'm absolutely confused. What version of TsiLang Components Suite do you use?
Because in 6.0.2 there is an explicit call to
Code:
procedure SetWideStrProp(Instance: TObject; PropInfo: PPropInfo;
  const Value: WideString); overload;

which works correctly with WideStrings. TsiLang uses the
Code:
procedure TsiCustomLang.siSetStrProp(const AObject: TObject; const PInfo: PPropInfo; const PropValue: string);

function to change all string properties, and this function uses
Code:
SetWideStrProp(AObject, PInfo, AnsiStringToWideStringCP(PropValue, CurrentCharset));

when the passed property is a WideString.
Do you have the same on your side?
DInfo



Joined: 02 Mar 2005
Posts: 24

Posted: Sat Oct 28, 2006 7:59 am

Igor,

We are using TsiLang 6.0.2 as well. Your example is exactly correct, with this exception:

When TsiCustomLang.siSetStrProp makes the following call

SetWideStrProp(AObject, PInfo, AnsiStringToWideStringCP(PropValue, CurrentCharset));

then you are converting your AnsiString from storage to a Widestring using the current CP.

So far, so good (as long as the CP is correct).

Here's the error:

In TypInfo.SetWideStrProp, the Delphi code passes the WideString Value to TypInfo.SetStrProp as follows:

SetStrProp(Instance, FindPropInfo(Instance, PropName), Value);

However, the declaration of SetStrProp is as follows (from memory)

procedure SetStrProp(Instance: TObject; PropInfo: PPropInfo;
const Value: String);

in which case the Value is an AnsiString.

So, Delphi does an implicit conversion that converts the Value parameter from a WideString to an AnsiString.

Then SetStrProp tests the type kind, and if the kind returned for the property is tkWString, it passes Value (now an AnsiString) to another function that takes a WideString as a parameter. In this process the AnsiString is once again converted to a WideString.

So, starting with TsiLang: the AnsiString from storage is converted to a WideString, back to an AnsiString, and then back to a WideString again.
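The kind of implicit conversion chain described here can be reproduced in isolation. The following is only a minimal sketch, not the TypInfo code itself: ThroughAnsi is a hypothetical stand-in for any routine declared with an AnsiString parameter, and the sample text (one Hebrew and one Thai letter) is an assumption chosen because a single legacy ANSI code page is unlikely to cover both scripts.

```pascal
program ImplicitConversionDemo;
{$APPTYPE CONSOLE}

// Hypothetical stand-in for a routine declared with an AnsiString parameter.
// A WideString argument is converted to AnsiString at the call site, and the
// AnsiString is converted back when it is assigned to the WideString result.
function ThroughAnsi(const Value: AnsiString): WideString;
begin
  Result := Value; // Ansi -> Wide, using the system ANSI code page
end;

var
  Original, Restored: WideString;
begin
  // Built from code points so that the source file itself stays plain ASCII.
  Original := WideChar($05D0);            // Hebrew letter Alef
  Original := Original + WideChar($0E01); // Thai letter Ko Kai

  Restored := ThroughAnsi(Original);      // Wide -> Ansi happens implicitly here

  if Restored = Original then
    Writeln('Round trip preserved the text on this system')
  else
    Writeln('Round trip changed the text on this system');
end.
```

The result depends on the ANSI code page of the machine that runs it, which is exactly why this kind of loss is easy to miss on a developer's own system. Newer Delphi compilers may warn about the implicit cast at the ThroughAnsi call.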

Since your storage mechanism depends on AnsiStrings as the foundation, most strings go through these conversions OK. However, even some AnsiStrings will have trouble with these conversions if the character being mapped by Windows does not have an exact code page mapping. For example, a quotation mark can be the plain ASCII character (code 34) or a typographic quote with a much higher Unicode value (such as U+201C, decimal 8220). During the mapping to Ansi, such a character can be replaced by its closest equivalent (depending on the options passed to the WideString-to-AnsiString conversion). Then, when it is "re-mapped" back to Unicode, it does not come back as the original character.
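One way to find out in advance whether a given string would survive such a mapping is to ask Windows directly. The sketch below is an illustration under stated assumptions, not anything TsiLang does: IsLossyInCodePage and NoBestFitFlag are hypothetical names, and the approach relies on the Win32 WideCharToMultiByte function with the WC_NO_BEST_FIT_CHARS flag (value $400) reporting, through its last parameter, that a default character had to be substituted.

```pascal
program LossyCheckDemo;
{$APPTYPE CONSOLE}
uses
  Windows;

const
  // Value of the Win32 WC_NO_BEST_FIT_CHARS flag, declared under a local
  // name in case the Windows unit in use does not define it.
  NoBestFitFlag = $00000400;

// Returns True when S cannot be stored exactly in the given ANSI code page.
// Intended for single- and double-byte ANSI code pages, not for UTF-8.
function IsLossyInCodePage(const S: WideString; CodePage: UINT): Boolean;
var
  Buffer: AnsiString;
  UsedDefault: BOOL;
  Written: Integer;
begin
  Result := False;
  if S = '' then
    Exit;
  SetLength(Buffer, Length(S) * 4); // generous upper bound on the byte count
  UsedDefault := False;
  Written := WideCharToMultiByte(CodePage, NoBestFitFlag, PWideChar(S),
    Length(S), PAnsiChar(Buffer), Length(Buffer), nil, @UsedDefault);
  // A failed call (Written = 0) is treated as lossy, to stay on the safe side.
  Result := (Written = 0) or UsedDefault;
end;

var
  Sample: WideString;
begin
  Sample := WideChar($201C);          // left double quotation mark
  Sample := Sample + WideChar($0414); // Cyrillic capital letter De
  Writeln('Lossy in code page 1252: ', IsLossyInCodePage(Sample, 1252));
  Writeln('Lossy in code page 1251: ', IsLossyInCodePage(Sample, 1251));
end.
```

Without the no-best-fit flag, Windows may silently substitute a similar-looking character and not report any loss, which matches the "mapped to its equivalent" behavior described above.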

However, my issue with the whole Ansi-to-Unicode mapping goes much deeper. In general, I need to fully enable an application for on-the-fly switching of languages. This means that, in some cases, one form or one Windows common dialog needs to display characters from more than one charset. This requirement is forced on us by implementing on-the-fly language switching on a Windows OS that is designed to fit only a few localizations. If I stick to AnsiStrings, having characters from more than one charset is problematic at best: the charset requirements have to be tracked, and charsets have to be switched correctly for the translations, over and above what TsiLang makes possible. This is especially problematic when we are not in charge of the thread locale (as in an ActiveDocument).

If you have hung on thus far ...

With all storage as Unicode, our problems with character re-mapping and multiple charsets go away completely. But evidently the Delphi designers goofed a bit in the SetWideStrProp call because, as written, it requires our Unicode strings to be converted to AnsiStrings and back to Unicode. This conversion completely obliterates Russian and Asian text that is pushed through a default charset (Windows common dialogs, etc.).

This is a very confusing problem. However, I am happy to report that after about 10 days of hacking, we now have an application that can do the following:

If language support is installed for the language group (Control Panel, Regional Settings), then we can use any language on any version of Windows. For example, we can do Russian on a Chinese OS, Japanese on Korean, Chinese on English, etc., including the common dialogs, and it works!

So, after I sleep for a month, I will pass along code that might be helpful. In the meantime, I hope that I've made a better case for going to straight Unicode for your storage. Doing so will give you many more options in languages and how they are supported.

Best,

David
isiticov
Site Admin


Joined: 21 Nov 2002
Posts: 2097

Posted: Sat Oct 28, 2006 9:23 am

David,
Our call
SetWideStrProp(AObject, PInfo, AnsiStringToWideStringCP(PropValue, CurrentCharset));
goes directly to
procedure SetWideStrProp(Instance: TObject; PropInfo: PPropInfo;
const Value: WideString); overload;

which doesn't call SetStrProp() for Unicode strings. As a result, there is no conversion from wide to Ansi, and no loss of Unicode data.
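The difference between the two overloads can be checked on any installation with a small console test. This is a hedged sketch only: TDemo and its Caption property are hypothetical, and the sample text (one Hebrew and one Thai letter) is an assumption chosen so that a single legacy ANSI code page is unlikely to represent both characters.

```pascal
program SetWideStrPropCheck;
{$APPTYPE CONSOLE}
uses
  Classes, TypInfo;

type
  // Hypothetical test class; any published WideString property would do.
  TDemo = class(TPersistent)
  private
    FCaption: WideString;
  published
    property Caption: WideString read FCaption write FCaption;
  end;

var
  Demo: TDemo;
  Value: WideString;
begin
  Value := WideChar($05D0);         // Hebrew letter Alef
  Value := Value + WideChar($0E01); // Thai letter Ko Kai

  Demo := TDemo.Create;
  try
    // Overload taking a PPropInfo, the one this thread says TsiLang calls.
    SetWideStrProp(Demo, GetPropInfo(Demo, 'Caption'), Value);
    Writeln('PPropInfo overload preserved text: ', Demo.Caption = Value);

    // Overload taking a property name, the one discussed earlier in the thread.
    Demo.Caption := '';
    SetWideStrProp(Demo, 'Caption', Value);
    Writeln('By-name overload preserved text: ', Demo.Caption = Value);
  finally
    Demo.Free;
  end;
end.
```

If the second line reports a mismatch while the first does not, the by-name overload in that TypInfo unit is routing the value through an AnsiString.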
DInfo



Joined: 02 Mar 2005
Posts: 24

Posted: Mon Oct 30, 2006 3:10 pm

Igor,

My bad ... of course you are correct on the TsiLang call.

It is still a puzzle why the TsiLang conversions truncated the Unicode strings. How could this happen? I don't know. I do know that staying in the Unicode world has worked miracles for our language compatibility.

Here is a huge advantage of using Unicode storage with no code page translations:

I am currently debugging a 13-language application using just US English XP. This is a massive timesaver. Previously, I had to spend a lot of time in the localized OSes to test: Chinese, Japanese, etc. We will of course do a basic "sanity check" installation on those OSes. However, 99% of our testing and debugging can be done in our native OS.

Woo hoo.

Thanks,

David
DInfo



Joined: 02 Mar 2005
Posts: 24

Posted: Mon Oct 30, 2006 3:15 pm

Igor,

One more follow up to your last message:

When TsiLang takes the Unicode translation that was created in the native language (stored in the Dictionary), saves it to AnsiString storage, and then converts it back to Unicode for display, it *is* subject to the same character re-mapping issues that I discussed above. This is a well-known issue with the Windows multi-byte conversion routines.

Food for thought ...

David
DInfo



Joined: 02 Mar 2005
Posts: 24

Posted: Sun Nov 05, 2006 3:55 pm

Igor,

Best wishes on your new release. Thanks for engaging with me on this whole Unicode issue.

David
isiticov
Site Admin


Joined: 21 Nov 2002
Posts: 2097

Posted: Sun Nov 05, 2006 6:11 pm

Hi David,

Thank you! I hope the next update will include Unicode-enabled storage ;)
All times are GMT