Unicode support

A place to discuss development of the Xataface core.

Postby Pedja » Wed Dec 27, 2006 3:52 pm

Unicode support is a must. Dataface is unusable for languages that do not use plain ASCII.

I managed to force Dataface to display Unicode data correctly with these changes:

In Dataface_Main_Template.html I altered the code to look like this:

Code: Select all
*}{if !$ENV.APPLICATION_OBJECT->main_content_only}



   

   {define_slot name="html_head"}
      
      {define_slot name="html_title"}{if $ENV.record}{$ENV.record->getTitle()} - {else}{$ENV.table} - {/if}{if $ENV.APPLICATION.title}{$ENV.APPLICATION.title}{else}Dataface Application{/if}{/define_slot}

I actually forced UTF-8 encoding in the document.

In Application.php I altered the code to look like this:

Code: Select all
         mysql_select_db( $dbinfo['name'] ) or die("Could not select DB: ".mysql_error($this->_db));
      }
      if ( !defined( 'DATAFACE_DB_HANDLE') ) define('DATAFACE_DB_HANDLE', $this->_db);
      //$res = mysql_query("show tables", $this->_db);
      //if ( !$res ){
      //   trigger_error(mysql_error($this->_db), E_USER_ERROR);
      //}
      //$this->tableIndex = array();
      //while ( $row = mysql_fetch_row($res) ){
      //   $this->tableIndex[$row[0]] = 1;
      //}
      //mysql_free_result($res);
      

      mysql_query("SET NAMES 'utf8'", $this->_db);                              // <-- inserted
      mysql_query("SET CHARACTER SET utf8", $this->_db);                        // <-- inserted
      mysql_query("SET COLLATION_CONNECTION = 'utf8_general_ci'", $this->_db);  // <-- inserted


      
      if ( !is_array( $conf['_tables'] ) ){
         echo "

            Error reading table information from the config file.  Please enter the table information in its own section
            of the ini file as follows:
            [_tables]
            table1 = Table 1 Label
            table2 = Table 2 Label
            
";
         exit;
      }


I marked the three lines of code I inserted. They put the MySQL connection into the proper Unicode state.
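The mysql_* functions used above were removed in PHP 7. A minimal sketch of the same idea with the mysqli extension follows. The helper name use_utf8_connection is hypothetical and not part of Dataface, and the credentials in the usage comment are placeholders.

```php
<?php
// Hypothetical helper (not part of Dataface): switch an open mysqli
// connection to UTF-8 so strings are sent and received as UTF-8.
function use_utf8_connection(mysqli $db, string $charset = 'utf8'): bool
{
    // mysqli::set_charset() sets the client, connection, and result
    // character sets for this connection, much like SET NAMES does.
    return $db->set_charset($charset);
}

// Usage, assuming placeholder credentials:
// $db = new mysqli('localhost', 'user', 'password', 'mydb');
// use_utf8_connection($db);
```

Calling set_charset() rather than sending SET NAMES by hand also lets the client library know which character set is in use, which matters for escaping functions.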

After these changes, Dataface shows table data properly. But that is not all: it still does not fully support UTF-8. Editing still does not work (invalid codes are inserted into records). I also guess that any advanced string function would cause problems when used on Unicode string values.
Pedja
 
Posts: 11
Joined: Wed Dec 31, 1969 5:00 pm

Postby shannah » Mon Jan 01, 2007 4:28 pm

As of Dataface 0.6, unicode is fully supported. All you have to do is add the following to the beginning of the conf.ini file:

oe = UTF-8
ie = UTF-8

Best regards

Steve
--
Steve Hannah
@shannah78 (on twitter)
sjhannah.com blog
shannah
 
Posts: 4457
Joined: Wed Dec 31, 1969 5:00 pm

Postby Pedja » Tue Jan 02, 2007 2:54 am

Before posting, I checked all the docs and the forum and found no clue about this. Should I put these parameters in the [_database] section?

Postby Pedja » Tue Jan 02, 2007 2:55 am

Have you noted this:

mysql_query("SET NAMES 'utf8'", $this->_db);
mysql_query("SET CHARACTER SET utf8", $this->_db);
mysql_query("SET COLLATION_CONNECTION = 'utf8_general_ci'", $this->_db);

This is a must for UTF-8 to work properly, at least for Cyrillic characters.

Postby Pedja » Tue Jan 02, 2007 3:04 am

I've set this and it indeed changed the way Dataface behaves. It displays content properly, but editing still does not work as expected.

For instance, when I enter 'strečer' into the field, it is written and displayed properly, but when I want to edit it, it shows as 'strečer', which is better than the 'stre?er' I got before.

Postby Pedja » Tue Jan 02, 2007 3:09 am

Hmm, this forum interprets the encoding, so it shows the problematic character as a character instead of as a code :)

I will try again: the word is encoded as 's t r e & # 2 6 9 ; e r' (without the spaces). It is written to the database in the same form.
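The code & # 2 6 9 ; (without the spaces) is the HTML numeric character reference for U+010D (č). Browsers typically submit a character this way when the page encoding of the form cannot represent it, which would be consistent with the edit form not being treated as UTF-8. That cause is an assumption, not something confirmed in this thread. A small sketch of what the stored value decodes to:

```php
<?php
// The value as it ended up in the database: a numeric character
// reference instead of the real character.
$stored = 'stre&#269;er';

// Decode HTML character references into real UTF-8 characters.
$decoded = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n";           // the word with U+010D in place of the reference
echo bin2hex($decoded), "\n";  // UTF-8 bytes; U+010D is encoded as c4 8d
```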

Postby Pedja » Tue Jan 02, 2007 3:10 am

Can we have an option to edit our own posts?

Postby shannah » Tue Jan 02, 2007 11:47 am

Before posting, I checked all the docs and the forum and found no clue about this. Should I put these parameters in the [_database] section?

No. Place them at the beginning of the conf.ini file - in no section at all:

e.g.

Code: Select all
ie=UTF-8
oe=UTF-8

[_database]
...

[_tables]
...
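Why the position matters can be sketched with PHP's own ini parser. This assumes conf.ini is read section-aware, the way parse_ini_string(..., true) works; the host and people entries are placeholders, not values from this thread.

```php
<?php
// Correct placement: ie/oe come before any [section] header.
$good = <<<'INI'
ie=UTF-8
oe=UTF-8

[_database]
host=localhost

[_tables]
people=People
INI;

// Wrong placement: ie/oe come after [_tables], so they join that section.
$bad = <<<'INI'
[_database]
host=localhost

[_tables]
people=People
ie=UTF-8
oe=UTF-8
INI;

$goodConf = parse_ini_string($good, true);
$badConf  = parse_ini_string($bad, true);

var_dump(isset($goodConf['ie']));            // top-level key is present
var_dump(isset($badConf['ie']));             // no top-level key here
var_dump(isset($badConf['_tables']['ie']));  // it landed inside [_tables]
var_dump(isset($goodConf['IE']));            // array keys are case sensitive
```

In the second layout the parameters would be read as two extra entries in the [_tables] section rather than as global settings.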

Postby maxmokeyev » Thu Feb 15, 2007 4:59 am

So, I've put the IE and OE parameters in the .ini file, but I still see "????" instead of my Cyrillic characters. Are there other modifications that need to be made? (When I access the database through ODBC, I use "set names cp1251".)
maxmokeyev
 
Posts: 9
Joined: Wed Dec 31, 1969 5:00 pm

Postby shannah » Thu Feb 15, 2007 1:24 pm

I believe ie and oe are case sensitive. Just want to make sure you did ie=UTF-8 and oe=UTF-8 and NOT IE=UTF-8 and OE=UTF-8.

Let me know if this is not the case and I'll look into it further.

-Steve


Postby maxmokeyev » Fri Feb 16, 2007 2:52 am

It does say:
oe=UTF-8
ie=UTF-8

When I do "Page Info" in Firefox on my app, it says "Content-Type text/html; charset=ISO-8859-1".

Postby shannah » Fri Feb 16, 2007 5:38 pm

Can you check the html source that is produced by your app?

It should have a Content-Type meta tag near the top. It should say charset=UTF-8 if it is picking up the oe and ie parameters.

If it does not, then you may have the oe and ie parameters in the wrong place.
Make sure they are at the beginning of your conf.ini file and not the end or middle.
If it does say UTF-8, then it is extremely strange that your browser is not picking it up...
Let me know how it goes.
-Steve

Postby maxmokeyev » Mon Feb 19, 2007 9:35 am

It does say charset=UTF-8. Very strange.
I have:
oe=UTF-8
ie=UTF-8

as the first two lines of the conf.ini file.

Postby shannah » Mon Feb 19, 2007 12:15 pm

In reviewing your example above, you say that it works for you when you run 'set names cp1251'. This means that you are not using Unicode in that example; you are using a Cyrillic text encoding (a Windows encoding).
Unicode (UTF-8) is a different encoding that is also compatible with Cyrillic text. I use unicode because it will work with virtually all known languages and charsets.
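A minimal sketch of the difference between the two encodings, assuming the mbstring extension is available. The sample word is only an illustration and does not come from the database discussed here.

```php
<?php
// "Привет" written with escapes so the source file encoding does not matter.
$utf8   = "\u{041F}\u{0440}\u{0438}\u{0432}\u{0435}\u{0442}";
$cp1251 = mb_convert_encoding($utf8, 'Windows-1251', 'UTF-8');

echo bin2hex($utf8), "\n";    // two bytes per Cyrillic letter
echo bin2hex($cp1251), "\n";  // one byte per Cyrillic letter

// The cp1251 bytes are not valid UTF-8, so a page declared as UTF-8
// cannot display them as-is.
var_dump(mb_check_encoding($cp1251, 'UTF-8'));

// Converting explicitly recovers the original text.
var_dump(mb_convert_encoding($cp1251, 'UTF-8', 'Windows-1251') === $utf8);
```

The same text is a different byte sequence in each encoding, so the database connection and the page have to agree on one of them, or convert between the two.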

Perhaps your browser is still not picking up Unicode properly. Try forcing your browser's text encoding to UTF-8 (try a few encodings to see what happens).

If all else fails, you can also try changing the character set of your mysql database/tables/fields to unicode (I presume they are using cp1251 right now) - but this should not be necessary.

Upon searching Google it also looks like MySQL versions older than 4.1 don't fully support unicode. Which version of MySQL are you using? This could also be the problem.

I would prefer to keep all apps in either ISO-Latin-1 or UTF-8, but it may be possible to make some small modifications to allow other character sets (like cp1251).

-Steve

Postby maxmokeyev » Mon Feb 19, 2007 1:03 pm

I guess you are right, and the problem is with the way the data is stored. It was originally imported from MS Access, and is currently being populated through an MS Access app.

Thanks for your help.

