Home
The Toolkit for Online Communities
15888 Community Members, 0 members online, 1711 visitors today
Log In Register

Forum OpenACS Development: tree_sortkey varbit limitations

OpenACS Home : Forums : OpenACS Development : tree_sortkey varbit limitations

Icon of Envelope Request notifications

+
Posted by Tom Jackson on

I was hoping to use the acs_objects use of tree_sortkey to create a hierarchy. I just use the context_id to point to the parent object, and the tree_sortkey gets updated like magic.

Here is some example data from acs_objects:

object_id |                   tree_sortkey
-----------+--------------------------------------------------
       370 | 00000100000000000000001100000000
       371 | 0000010000000000000000110000000000000000
       372 | 000001000000000000000011000000000000000000000000
       374 | 000001000000000000000011000000000000000000000001
       373 | 0000010000000000000000110000000000000001
       375 | 00000100000000000000001100000001
       376 | 0000010000000000000000110000000100000000

I used the following db_multirow to create a visual representation of the hierarchy:

db_multirow -extend {
    indent
} categories blog_categories "
select
 c.*,
 o.context_id,
 o.tree_sortkey
from
 blog_categories c,
 acs_objects o
where
 c.category_id = o.object_id
order by
 o.tree_sortkey" {

     set varbit_length [string length $tree_sortkey]
     set indent_amount [expr ($varbit_length -32) / 4]
     set indent [string repeat "." $indent_amount]
 }

Each varbit is at least 32 chars long, and each child varbit is an additional 8 chars, so the above code adds two indent chars per child.

Now the question: it looks like this implimentation only allows for 256 children, or 8 bits. It this true, or is there more magic hidden in the pl code which maintains this?

+
Posted by Jun Yamog on
Hi Tom,

I am a bit sleepy now, although not directly answering your question.  You can look at tree_level call instead of getting the length of tree_sortkey, etc.

+
Posted by Tom Jackson on

Thanks Jun, that code seems to confirm that the toplevel has 32 bits of slots open, while children are limited to 8 bits per level. I'm not sure the using the tree_level would be any more efficient than the tcl code, they do pretty much the same thing, but tree_level is more general. I would still need to extend the db_multirow, or at least run a foreach to create the formatting.

+
Posted by Don Baccus on
You'll note that the 32 bit (*not* char BTW) sortkeys have leading zeros while the 8 bit sortkeys have leading ones ...

I presume that's enough to answer your question?  If not check the sortkeys for the first 256 objects in your install :)

+
Posted by Don Baccus on
BTW it's easy to calculate your indention string in the query itself using tree_level() - this isolates you from the consequences of any change in representation (not that I expect there to be any but still ...)
+
Posted by Don Baccus on
Sorry flip my zeros and ones above ... keys 0..127 (not 255 :) and 32-bit keys start with 1. So your examples above are more deeply nested trees than you think:
openacs46=# select tree_level('00000100000000000000001100000000');
 tree_level 
------------
          4
(1 row)

openacs46=# 

Also if you calculate the indention string directly in your query you don't need to extend or db_foreach, just build a string with the right number of spaces:

lpad(' ',(tree_level(fs_tree.tree_sortkey) - 1), ' ') as spaces

+
Posted by Tom Jackson on

Okay, I'm looking further. It seems from the level code that maybe if you go over the limit for the short 7 bit key plus the leading zero, you get a 31 bit key, plus the leading one.

So the tcl code I wrote would break pretty quickly in normal use. But will every installation start out at the same level? Maybe not? I might need to get the level for the root of my tree first.

+
Posted by Tom Jackson on

So here are the results of adding over 127 children to one parent:

 object_id |                           tree_sortkey
-----------+------------------------------------------------------------------

       518 | 0000010000000000000000110000000001111100
       519 | 0000010000000000000000110000000001111101
       520 | 0000010000000000000000110000000001111110
       521 | 0000010000000000000000110000000001111111
       522 | 0000010000000000000000110000000010000000000000000000000010000000
       523 | 0000010000000000000000110000000010000000000000000000000010000001
       524 | 0000010000000000000000110000000010000000000000000000000010000010
       525 | 0000010000000000000000110000000010000000000000000000000010000011
(134 rows)

So the pl code smoothly transitions to a larger key. Cool!

+
Posted by Tom Jackson on

One last note: besides being correct to use the tree_level function, it is significantly faster, probably an order of magnitude. The db_multirow ended up being:


db_multirow categories blog_categories "
select
 c.*,
 o.context_id,
 lpad('.',(tree_level(o.tree_sortkey) - 4), '.') as indent
from
 blog_categories c,
 acs_objects o
where
 c.category_id = o.object_id
order by
 o.tree_sortkey"

+
Posted by Don Baccus on
Yes, that's it exactly, Tom - at the 7 bit limit I add a leading 1 and 31 bits of key data.

This ensures that 128 > 127 when compared as bitstrings:

127 = 01111111
128 = 10000000000000000000000010000000

PG bitstrings are compared just like bytestrings, in other words they're matched bit-by-bit until there's a mismatch, then the larger bit wins.

A little thought should convince you that this works for sortkeys attached to nodes at any level in the tree, of any number.