[redland-dev] Messing up bdb
Jasper van de Gronde
th.v.d.gronde at hccnet.nl
Thu Jun 16 11:05:25 BST 2005
Danny Ayers wrote:
>>On 6/15/05, Dave Beckett <dave.beckett at bristol.ac.uk> wrote:
>>
>>>Yes, redland uses a 2-byte field to store the length of literals
>>>when it encodes and stores them. Suggestions what to do when it's too
>>>long are welcome - probably at the least it should warn.
>
> What are the limits on URI strings? Look like I'm getting corruption there too.
>
> (255 would seem a good choice, except that might cause problems with
> SPARQL queries...)
I'd suggest either using a comfortably high fixed limit (32-bit integers
come to mind) or a variable-length encoding for the length, similar to
what EBML uses: the length field starts with a bit pattern that tells you
how many bytes to read. A simpler variant sets the highest bit of each
byte when another byte should be read. A variable-length encoding uses a
nearly optimal amount of storage space and is extremely scalable
(although I think it would be overkill in this case), but it is also
slightly slower (although probably not by much).
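
To make the "highest bit means another byte follows" variant concrete,
here is a minimal sketch in C. The function names varlen_encode and
varlen_decode are hypothetical and not part of the Redland API, and the
choice of least-significant-group-first order and a 32-bit maximum are
my own assumptions. Each byte carries 7 bits of the length, so lengths
below 128 cost one byte and a full 32-bit length costs five.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the Redland API. */

/* Encode len into out, 7 bits per byte, least significant group first.
 * The high bit of a byte is set when another byte follows.
 * out must have room for 5 bytes. Returns the number of bytes written. */
size_t varlen_encode(uint32_t len, unsigned char *out)
{
  size_t n = 0;
  assert(out != NULL);
  while (len >= 0x80) {
    out[n++] = (unsigned char)((len & 0x7F) | 0x80);
    len >>= 7;
  }
  out[n++] = (unsigned char)len;
  return n;
}

/* Decode a length from in, reading at most avail bytes.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or does not fit in 32 bits. */
size_t varlen_decode(const unsigned char *in, size_t avail, uint32_t *len)
{
  uint32_t value = 0;
  size_t n = 0;
  assert(in != NULL && len != NULL);
  while (n < avail && n < 5) {
    unsigned char byte = in[n];
    /* The fifth byte may only carry the top 4 bits of a 32-bit value. */
    if (n == 4 && byte > 0x0F)
      return 0;
    value |= (uint32_t)(byte & 0x7F) << (7 * n);
    n++;
    if (!(byte & 0x80)) {
      *len = value;
      return n;
    }
  }
  return 0;
}
```

With this scheme a short literal still pays only one byte for its
length, while a 70000-byte literal (too long for a 2-byte field) takes
three.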
A third option would be to let the user of the library easily change the
limit (although that might cause some compatibility issues if you're not
careful).
Using a fixed limit that's too close to sane values will only lead to
trouble.
And it would indeed be good to check whether or not a literal is too
large when using a fixed limit.
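
As a rough illustration of such a check, the sketch below refuses to
encode a literal whose length does not fit in a 2-byte field instead of
silently truncating the length. Everything here is hypothetical: the
name literal_encode_checked, the LITERAL_LEN_MAX constant, and the
big-endian layout of the length field are my assumptions, not how
Redland actually stores literals.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Largest length that fits in a 2-byte field (assumed limit). */
#define LITERAL_LEN_MAX 0xFFFFu

/* Hypothetical helper; not part of the Redland API.
 * Writes a 2-byte big-endian length followed by the literal's bytes.
 * Returns the number of bytes written, or 0 if the literal is too long
 * for the length field or the output buffer is too small. A caller can
 * warn or fail on 0, since every valid encoding is at least 2 bytes. */
size_t literal_encode_checked(const char *literal,
                              unsigned char *out, size_t out_size)
{
  size_t len = strlen(literal);

  if (len > LITERAL_LEN_MAX)
    return 0; /* would overflow the 2-byte length field */
  if (out_size < len + 2)
    return 0; /* not enough room for length plus data */

  out[0] = (unsigned char)((len >> 8) & 0xFF);
  out[1] = (unsigned char)(len & 0xFF);
  memcpy(out + 2, literal, len);
  return len + 2;
}
```

The point is only that the failure becomes visible at write time rather
than showing up later as a corrupted store.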