Forum OpenACS Development: Best way to join with latest revisions

I have three different tables managed with the CR which I need to join for Listbuilder. At the moment I'm doing this:

select distinct t.solution_id, cr.title, t.date, t.name_title, t.themes, o.organisation_id, o.organisation_name, c.surname, c.first_name, c.contact_id
from cr_folders cf, cr_items ci, cr_revisions cr, solutions t, (select organisation_name, idcompany, organisation_id from
csr_organisationsx ox, cr_items ic where ic.latest_revision = ox.revision_id) o,
(select contact_id, idcontact, first_name, surname from contactsx cx, cr_items ic where ic.latest_revision = cx.revision_id) c
where cr.revision_id = ci.latest_revision
and t.solution_id = cr.revision_id
and ci.parent_id = cf.folder_id
and cf.package_id = :package_id
and o.idcompany = t.idcompany
and t.idcontact = c.idcontact
[template::list::page_where_clause -and -name "solutions"]
[template::list::filter_where_clauses -and -name "solutions"]
[template::list::orderby_clause -orderby -name "solutions"]

I refuse to believe that this is the most efficient way to make sure that I only join with the latest revision (or live revision once we change to that). Any hints?

Collapse
Posted by Claudio Pasolini on
When dealing with the CR I usually add a couple of views, i.e.:
create or replace view mis_active_products as
        select x.*
        from   cr_items ci, mis_productsx x
        where  ci.item_id    = x.item_id and
               x.revision_id = ci.live_revision;

create or replace view mis_fast_products as
        select p.*, ci.name, cr.*
        from   cr_items ci, mis_products p, cr_revisions cr
        where  p.product_id = ci.live_revision and
               p.product_id = cr.revision_id;
The first one enables a cleaner code, because I can use it without worrying about joining with cr_items and cr_revisions. I use the second one for performance reasons, because I noticed that when using the auto generated mis_productsx view the resulting query plan is really awfull.
Collapse
Posted by Jeff Davis on
I wonder if we should consider adding live and latest revision
to the OBJi and OBJx views or possibly a OBJ(some other letter) view which would include those. The only issue I
can see in adding them to the OBJx view is that you might break some queries which do the style of joins you are
talking about here.

Then the query would be "select ... from contactsx x where x.latest_revision = revision_id" as we are already joining cr_items inside the view anyway.

It's a pretty common use case I think.

Collapse
Posted by Dave Bauer on
Claudio,

Can you post the queries and plans where the automatically generated view is much slow than your view? I can see that your view does not join on acs_objects.

Collapse
Posted by Dave Bauer on
Jeff,

I agree that adding latest revisio n and live revision is a good idea. It would break all the CR based code I have been writing though.

I suppose it would not be too difficult to update that code and it definitely would make it more readable.

Collapse
Posted by Jeff Davis on
Dave, could you post an example of something that would break?
Collapse
Posted by Claudio Pasolini on
Dave,

here is an example using a view wich in turns uses the CR auto generated view:

oacs51=# explain select x.item_id,
               title, 
               upper_title, 
               product_code,
               x.is_active_p,
               u.um_code,
               x.brand_id,
               x.category_id
        from  mis_active_products x, mis_um u
        where  x.item_id = 1 and
               x.um_id   = u.um_id;
QUERY PLAN

 Nested Loop  (cost=24.94..31.10 rows=5 width=647)
   ->  Seq Scan on cr_text  (cost=0.00..1.01 rows=1 width=0)
   ->  Nested Loop  (cost=24.94..30.05 rows=4 width=647)
         ->  Index Scan using cr_items_pk on cr_items i  (cost=0.00..4.99 rows=2 width=4)
               Index Cond: (1 = item_id)
         ->  Materialize  (cost=24.94..24.97 rows=3 width=647)
               ->  Nested Loop  (cost=5.00..24.94 rows=3 width=647)
                     ->  Nested Loop  (cost=5.00..20.10 rows=1 width=638)
                           ->  Nested Loop  (cost=5.00..15.27 rows=1 width=43)
                                 ->  Hash Join  (cost=5.00..11.75 rows=1 width=39)
                                       Hash Cond: ("outer".revision_id = "inner".live_revision)
                                       ->  Index Scan using cr_revisions_item_id_idx on cr_revisions cr  (cost=0.00..6.73 rows=2 width=35)
                                             Index Cond: (item_id = 1)
                                       ->  Hash  (cost=4.99..4.99 rows=2 width=8)
                                             ->  Index Scan using cr_items_pk on cr_items ci  (cost=0.00..4.99 rows=2 width=8)
                                                   Index Cond: (1 = item_id)
                                 ->  Index Scan using acs_objects_pk on acs_objects  (cost=0.00..3.50 rows=1 width=4)
                                       Index Cond: ("outer".live_revision = acs_objects.object_id)
                           ->  Index Scan using mis_products_pkey on mis_products  (cost=0.00..4.82 rows=1 width=611)
                                 Index Cond: ("outer".live_revision = mis_products.product_id)
                     ->  Index Scan using mis_um_pkey on mis_um u  (cost=0.00..4.82 rows=1 width=17)
                           Index Cond: ("outer".um_id = u.um_id)
(22 rows)

Avoiding the CR generated view things are much better:

oacs51=# explain select x.item_id, 
               title,
               upper_title, 
               product_code,
               x.is_active_p,
               u.um_code,
               x.brand_id,
               x.category_id
        from  mis_fast_products x, mis_um u
        where  x.item_id = 1 and
               x.um_id   = u.um_id;
 QUERY PLAN

 Nested Loop  (cost=0.00..36.46 rows=3 width=647)
   ->  Nested Loop  (cost=0.00..26.79 rows=2 width=638)
         ->  Nested Loop  (cost=0.00..17.12 rows=2 width=39)
               ->  Index Scan using cr_revisions_item_id_idx on cr_revisions cr  (cost=0.00..6.73 rows=2 width=35)
                     Index Cond: (item_id = 1)
               ->  Index Scan using cr_items_by_live_revision on cr_items ci  (cost=0.00..5.18 rows=1 width=4)
                     Index Cond: ("outer".revision_id = ci.live_revision)
         ->  Index Scan using mis_products_pkey on mis_products p  (cost=0.00..4.82 rows=1 width=611)
               Index Cond: (p.product_id = "outer".revision_id)
   ->  Index Scan using mis_um_pkey on mis_um u  (cost=0.00..4.82 rows=1 width=17)
         Index Cond: ("outer".um_id = u.um_id)
(11 rows)
You are right. I don't join with acs_objects because in most cases I don't need it and this makes a huge difference in performance terms.
Collapse
Posted by Jeff Davis on
Claudio, what are the times you get for explain analyze for those two queries?
Collapse
Posted by Claudio Pasolini on
The run times of the two queries are 51.204 ms and 1.355 respectively on my development laptop with very few data.
Collapse
Posted by Don Baccus on
Can someone explain why we need cr_text in Oracle (much less PG)?