
Unicode in Filenames



Some users may experience problems with filenames that include non-ASCII
Unicode characters, e.g.

   Les mouvements de réforme.doc
   l'énoncé.sxw
   Rammstein - Bück Dich.mp3
   megaherz - Glas Und Tränen.mp3
   Megaherz - Miststück.MP3

It turns out that in Python 2.2 (which Cedar Backup uses by default),
all of the filesystem functions deal with paths using a specific
encoding (codepage) value taken from a system-wide setting.  By default,
this encoding value is 'ascii', which (not surprisingly) causes problems
for filenames like the ones above.
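
To make the failure concrete, here is a minimal sketch of what goes wrong
when such a name has to pass through the 'ascii' codec.  The can_encode()
helper is hypothetical (it is not part of Cedar Backup), and the filename
is written with an escape so that the example does not depend on the
encoding of the source file:

```python
# Hypothetical helper for illustration only; not part of Cedar Backup.
def can_encode(name, encoding):
    """Return True if the Unicode string `name` can be encoded with `encoding`."""
    try:
        name.encode(encoding)
        return True
    except UnicodeError:
        return False

name = u'Mistst\xfcck.MP3'   # \xfc is the u-umlaut in "Miststück.MP3"

print(can_encode(name, 'ascii'))        # False: u-umlaut is outside ASCII
print(can_encode(name, 'iso-8859-1'))   # True: Latin-1 has that character
```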

The solution is to create a Python site-customization file containing
your preferred encoding.  On a Debian system, the file should be called:

   /usr/lib/python2.2/site-packages/sitecustomize.py

The file might be located in a different place on a non-Debian system.

In sitecustomize.py, place two lines:

   import sys
   sys.setdefaultencoding('iso-8859-1')
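
To confirm that the interpreter actually picked up the file, you can ask
Python which default encoding is in effect.  This is only a quick check,
and describe_default_encoding() is a name I made up for the example.  With
the sitecustomize.py above in place, Python 2.2 should report iso-8859-1;
without it, ascii.

```python
import sys

# Hypothetical helper for illustration only.
def describe_default_encoding():
    """Return a one-line description of the interpreter's default encoding."""
    return "default encoding: %s" % sys.getdefaultencoding()

print(describe_default_encoding())
```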

The iso-8859-1 codepage (Latin-1) should work for most Western European
languages.  For other languages (in particular Eastern languages), you
may need to choose a different codepage.
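
If you are not sure which codepage covers your filenames, one rough way to
find out is to try a few candidates in order.  The sketch below assumes
you already have the names as Unicode strings; first_usable_encoding() and
the candidate lists are my own illustration, not anything Cedar Backup
provides.

```python
# Hypothetical helper for illustration only; not part of Cedar Backup.
def first_usable_encoding(names, candidates):
    """Return the first encoding in `candidates` that can encode every
    Unicode string in `names`, or None if none of them can."""
    for encoding in candidates:
        try:
            for name in names:
                name.encode(encoding)
        except UnicodeError:
            continue          # this codepage cannot represent some name
        else:
            return encoding
    return None

european = [u"l'\xe9nonc\xe9.sxw", u'Mistst\xfcck.MP3']
print(first_usable_encoding(european, ['ascii', 'iso-8859-1']))

# A name with characters outside Latin-1 needs something broader.
mixed = european + [u'\u65e5\u672c.txt']
print(first_usable_encoding(mixed, ['ascii', 'iso-8859-1', 'utf-8']))
```

A name that no candidate can represent makes the function return None,
which is a hint that the candidate list needs another entry.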

Eventually, when I can move Cedar Backup to Python 2.3, this setting
will no longer be required, as Python 2.3 deals with Unicode filepaths
much more seamlessly than does Python 2.2.

KEN

-- 
Kenneth J. Pronovici <kenneth.pronovici@cedar-solutions.com>
Cedar Solutions Software 
http://www.cedar-solutions.com/

