Re: Unicode Character Allocation
- From: Maxim Demenko <mdemenko@xxxxxxxxx>
- Date: Thu, 13 Apr 2006 21:20:41 +0200
my_grillz_gleam wrote:
Hello all,
I have a quick question regarding how Oracle allocates storage space
for its data types. In particular, I have been tasked with developing
processes to move data between Oracle and DB2 databases, both of which
are set to use UTF-8. I have no problems moving data from the DB2 tables
to the Oracle tables; however, moving from Oracle to DB2 has been causing
records to be rejected. Note that both tables have exactly the same DDL and
the Oracle database is using BYTE semantics (DB2 only has BYTE semantics).
My question is:
Does Oracle, in UTF-8 mode, actually allocate 4 bytes for every byte
specified in the DDL for a character field?
That is, does VARCHAR2(100 BYTE) allocate 400 bytes or 100 bytes of disk
space? It seems to me, from my testing, that this is the case.
Unfortunately, my Oracle DBA was not able to confirm this.
You can look it up in the Oracle online documentation:
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14225/ch6unicode.htm#g1014017
<quote>
UTF-8 is the 8-bit encoding of Unicode. It is a variable-width encoding and a strict superset of ASCII. This means that each and every character in the ASCII character set is available in UTF-8 with the same code point values. One Unicode character can be 1 byte, 2 bytes, 3 bytes, or 4 bytes in UTF-8 encoding. Characters from the European scripts are represented in either 1 or 2 bytes. Characters from most Asian scripts are represented in 3 bytes. Supplementary characters are represented in 4 bytes.
</quote>
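In other words, the number of bytes depends on which characters are stored: in UTF-8, 100 characters can occupy anywhere from 100 bytes up to 400 bytes. A column declared with BYTE semantics, such as VARCHAR2(100 BYTE), is limited to 100 bytes, so it may hold fewer than 100 characters when the data contains multibyte characters.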
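To make the variable width concrete, here is a small Python sketch (not Oracle code) that measures the UTF-8 byte length of 100 characters from each class mentioned in the quote. The sample characters and the helper name utf8_len are my own choices for illustration only.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# 100 characters of each kind; the specific characters are arbitrary examples.
samples = {
    "ASCII letter": "a" * 100,                      # U+0061
    "Latin letter with umlaut": "\u00e4" * 100,     # U+00E4
    "CJK ideograph": "\u6f22" * 100,                # U+6F22
    "supplementary character": "\U0001d11e" * 100,  # U+1D11E
}

# Byte length of each 100-character sample in UTF-8.
byte_lengths = {name: utf8_len(text) for name, text in samples.items()}

for name, size in byte_lengths.items():
    print(f"{name}: 100 characters -> {size} bytes")
```

Running the same kind of check on the values that DB2 rejects should show whether their byte length, rather than their character count, is what exceeds the column limit.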
Best regards
Maxim