Here are 2 pure SQL queres may help you to a bit.
Query to get all complete multi-bytes characters to fill in a SQL varchar2 type
It uses a recursive query to do binary search to get the maximum character length:
with b4000 (id, e, s, len)
as
(select rowid as id, 666 e, 666 s, lengthb(dbms_lob.substr(text, 666, 1)) len from clob_src
union all
select b1.id,
case when b1.len + lengthb(dbms_lob.substr(text, s, e)) <= 4000 then e + s else e end,
case when b1.len + lengthb(dbms_lob.substr(text, s, e)) <= 4000 then s else trunc(s/2) end,
case when b1.len + lengthb(dbms_lob.substr(text, s, e)) <= 4000 then len + lengthb(dbms_lob.substr(text, s, e)) else len end
from clob_src s1, b4000 b1
where s1.rowid = b1.id and case when b1.len + lengthb(dbms_lob.substr(text, s, e)) <= 4000 then s else trunc(s/2) end >= 1
) select * from b4000
* Oracle only holds 4000 bytes and a UTF-8 character can take up to 6 bytes, that comes the magic number 4000/6 ~= 666. It will be the number of characters we pull out from CLOB each time. When it reaches the 4000 bytes limits, we cut the step by half recursively until the the step reaches to 1.
Query to get bytes length of your CLOB column:
with blen(id, i, len, lenb)
as (
select rowid as id, 0, length(dbms_lob.substr(text, 666, 1)), lengthb(dbms_lob.substr(text, 666, 1)) from clob_tgt
union all
select b.id, i+ 1, len + length(dbms_lob.substr(text, 666, (i+1)*666 + 1)), lenb + lengthb(dbms_lob.substr(text, 666, (i+1)*666 + 1))
from clob_tgt s, blen b
where s.rowid = b.id and len + length(dbms_lob.substr(text, 666, (i+1)*666 + 1)) <= dbms_lob.getlength(text)
) select * from blen
No comments:
Post a Comment