Re: storing byte values
- From: "Vladimir M. Zakharychev" <vladimir.zakharychev@xxxxxxxxx>
- Date: Tue, 07 Aug 2007 10:52:39 -0700
On Aug 7, 7:50 pm, "Vladimir M. Zakharychev"
<vladimir.zakharyc...@xxxxxxxxx> wrote:
On Aug 7, 6:44 pm, pascal.dull...@xxxxxxxx wrote:
Thanks for your answers, but they are not very helpful... ;-)
I can't explain the data model or the business plan, because it's not
my model or plan. I need to migrate any existing database (within
certain rules, of course) and it needs to work with an existing
application.
So... is there a way to store byte values and to define a unique
constraint which includes the column with the byte values?
It would be sufficient to use a CHAR or VARCHAR column, if all
possible bytes would be returned unchanged. But even with NCHAR there
seems to be data loss, because some bytes seem to be unbound in the
used character set. For example: when writing the character 0x80 into
an NCHAR column and reading it again, the returned value is 0xBF. The
same goes for 26 other bytes. They all get converted to 0xBF.
Do I have to pay attention to something special when using NCHAR?
By the way, I'm using JDBC to connect to the database. I insert the
values as string parameters of an PreparedStatement, so they should
arrive as unicode strings in the database.
If you want to store binary data that's not subject to charset
conversions, in Oracle you use RAW or BLOB. You can have more than one
RAW column in a table (don't confuse RAW with LONG RAW, the latter has
one column per table restriction, but LONGs are obsolete since 8i and
are replaced with LOBs. They are still there for compatibility with
previous releases and massive inherited code base that still uses
them, but Oracle expressly warns in the docs that LONGs may disappear
in any future release. Maybe they already did in 11g..?)
Any character type, be it VARCHAR2 or NVARCHAR2, is subject to charset
conversions, so you can't store arbitrary binary data in such columns
unless you somehow encode it into a charset-neutral form (like Base64)
or cast it to RAW and back using UTL_RAW package procedures. Since you
can't place constraints on BLOBs, your only obvious choice is RAW,
which is limited to 4000 bytes (that is, RAW(4000) is the longest
possible RAW column.)
Anyway, I urge you to reconsider your approach, which is flawed: using
PRIMARY KEY constraint on a long column to enforce uniqueness of
binary data of up to 10kb in length stored in that column probably
won't work unless you use large block size and a composite primary key
over two or three RAW columns into which the data is broken down on 4k
boundaries, and even in this case it will be highly inefficient way to
ensure uniqueness of your binary data. What you could do instead is
this:
CREATE TABLE MY_TRICKY_TABLE (
ID NUMBER(20) PRIMARY KEY
,HASH$ RAW(40)
,DATA BLOB
,CONSTRAINT UQ$MY_TRICKY_TABLE#HASH UNIQUE(HASH$)
)
and then hash the data using SHA-1 (see DBMS_CRYPTO package, Hash()
function,) and put the hash into the HASH$ column. You can do this in
a BEFORE INSERT FOR EACH ROW trigger on the table. You can get away
with 32 bytes per key if you use MD5 instead of SHA-1, but chance of
hash collision is slightly bigger with MD5 (still negligible unless
you're going to store gazillions of unique documents.) Note that since
HASH$ depends on DATA, it can't be the primary key, so I simply create
a UNIQUE constraint on HASH$ to enforce uniqueness of DATA through its
hash and introduce a surrogate primary key ID, which will uniquely
identify all rows in the table and can be used for referential
integrity.
Hth,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com
Correction to the above: obviously, I mixed hash lengths in bytes and
lengths in characters of their corresponding hexadecimal
representations. SHA-1 raw hash size is 20 bytes and MD5 is 16 bytes,
so declare the HASH$ as RAW(20) or RAW(16). 40 and 32 would be
appropriate for VARCHAR2, respectively.
Regards,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com
.
- Follow-Ups:
- Re: storing byte values
- From: pascal . dulleck
- Re: storing byte values
- References:
- storing byte values
- From: pascal . dulleck
- Re: storing byte values
- From: sybrandb
- Re: storing byte values
- From: pascal . dulleck
- Re: storing byte values
- From: Vladimir M. Zakharychev
- storing byte values
- Prev by Date: Re: ORA-29506 when compiling java
- Next by Date: Re: How to compare NUMBER's
- Previous by thread: Re: storing byte values
- Next by thread: Re: storing byte values
- Index(es):
Relevant Pages
|