Forum OpenACS Development: stable urls for all objects

Posted by Timo Hentschel on
When displaying a list of objects, you probably want to provide urls to the objects in addition to the object names. But urls are a big problem if the objects belong to some other package. So how do we do this?

In the categorization package I solved that problem the following way: I added an index.vuh file in /www/o that accepts an object_id and queries the acs_objects and acs_named_objects tables (see my other posting) to figure out the object_type and package_id. It then gets the package-instance url for that package_id from the cache and invokes a service contract to figure out the local url to display that object.

What is this service-contract about? I called it AcsObject.PageUrl and there should be an implementation for every displayable object type, called ${object_type}_idhandler, so that the implementation responsible for a particular object type can be found easily. The implementations should accept an object_id and return a string pointing to the page that can show that object, e.g. "view?posting_id=$object_id".

The result of using this index.vuh page and the service contract is that we don't need to resolve the urls to objects when displaying them, but do that work only when the user actually clicks on an object to see it. This also provides a stable url for every object in the system, since (as I assume) object_ids will never change, whereas the name of an object and the package url certainly can.
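
The dispatch described above can be sketched in plain Tcl. This is only an illustration, not the actual index.vuh: the two dicts stand in for acs_objects and the cached package urls, and forums_message_idhandler and resolve_object_url are hypothetical names that follow the ${object_type}_idhandler convention.

```tcl
# Stand-ins for acs_objects and the site-node cache (illustrative data only).
set objects [dict create \
    666 {object_type forums_message package_id 42}]
set package_urls [dict create 42 /forums/]

# Hypothetical handler: maps an object_id to a package-local url.
proc forums_message_idhandler {object_id} {
    return "message-view?message_id=$object_id"
}

# Resolve an object_id to its full url, or return "" if it cannot be resolved.
proc resolve_object_url {objects package_urls object_id} {
    if {![dict exists $objects $object_id]} {
        return ""
    }
    set object_type [dict get $objects $object_id object_type]
    set package_id  [dict get $objects $object_id package_id]
    set handler "${object_type}_idhandler"
    # No handler or no mounted package: the object is not displayable.
    if {[llength [info procs $handler]] == 0
        || ![dict exists $package_urls $package_id]} {
        return ""
    }
    return "[dict get $package_urls $package_id][$handler $object_id]"
}

puts [resolve_object_url $objects $package_urls 666]
```

The list page only ever needs to emit /o/666; the lookup and the handler call run when the link is followed.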

Are there any thoughts about this? Can we go with my proposal?

Posted by Dirk Gomez on
Yes, here's a thought which I already voiced on irc - Peter and Lars are probably the ones to answer that question.

What about I18N?!

Imagine I pass on "Look at http://www.foobar.com/o/666" to someone. The object shows up in German to me. This person has Italian as his default language. Now there is no Italian translation for that object. The server will fall back to French because this is the default language. /o/666 now shows a French translation that may or may not be a good translation of the object I intended to show to my bilingual Italian friend.

So - *without* knowledge about the chosen I18N solution - I started to worry about the approach we used for stable URLs in a *monolingual* system.

Here's a rough proposal: change the request processor so that it accepts LC_$ISO_LANGUAGE_CODE somewhere in the URL. I could pass on "http://www.foobar.com/LC_de/o/666" *and* "http://www.foobar.com/LC_it/o/666" to him and tell him..."look how bad they are at translating".

It would also give the user a way to alter the default language setting by just fiddling with the URL a bit.
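
A minimal sketch of how such a prefix could be split off before normal URL resolution. It assumes two-letter lowercase language codes, and split_locale_prefix is a hypothetical helper, not part of the request processor.

```tcl
# Split an optional leading /LC_xx segment off a request url.
# Returns a two-element list {locale remaining_url}; locale is "" if absent.
proc split_locale_prefix {url} {
    if {[regexp {^/LC_([a-z]{2})(/.*)?$} $url -> lang rest]} {
        # "/LC_de" with nothing after it maps to the site root.
        if {$rest eq ""} { set rest "/" }
        return [list $lang $rest]
    }
    return [list "" $url]
}

puts [split_locale_prefix /LC_de/o/666]   ;# de /o/666
puts [split_locale_prefix /o/666]         ;# {} /o/666
```

With no prefix the url passes through unchanged, so existing links would keep working.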

Posted by Don Baccus on
Changing the request processor to know about locales was exactly the approach I took with Greenpeace Planet.  In GP's case things are a bit more complex because we map to both an NRO (National/Regional Organization, i.e. GP Nederland vs. GP Deutschland) and a language (think GP Canada/French vs. GP Canada/English).

The legal mappings are all nsv-cached at start-up and kept consistent by the admin UI that manages the mappings, so there's no database hit.  We also use cookies to remember preferences for the user between sessions, if they have cookies enabled in their browser.

If you guys want to look at the code you're more than welcome to. The GP code is GPL'd, of course.

Posted by Timo Hentschel on
Since no real objection has been voiced, I consider my proposal of having stable urls for all objects as accepted, but I want to raise the question of how to achieve this once again.

Can we go with having a service contract to figure out the local url to display an object that every package has to implement for every object type it provides? Would the name of AcsObject.Url be acceptable?

Should we have a global vuh-page as I suggested, possibly extended to let the user optionally add a locale to further specify the object translation (not sure how to implement that one)? Advantage: we could have a really easy to remember mechanism like /o/$object_id. Or do we want to go the normal way of providing a global page like object-display?object_id=$object_id ?

Posted by Tom Jackson on
Since no real objection has been voiced, I consider my proposal of having stable urls for all objects as accepted...

Maybe implement it as an optional package, for those who might not want to use it? I don't think anyone can object until more is known.

Posted by Dave Bauer on
I like the index.vuh approach, but if we use an ad_returnredirect at the end of the file that does the dispatching, it might be more efficient to use an object-display?object_id=12345 type of page.

Posted by Dave Bauer on
Tom,

It really makes sense for packages to have a tcl proc that generates a URL for user-visible objects. We already have a URL service contract for search, and for notifications. Do we need another for categorization, or should we choose a standard service contract that can return the URL of an object?

I have another idea for the index.vuh. It should live at the root of the site, and accept just an integer as the last part of the URL, e.g. http://example.com/foo/12345.

This would allow creating a URL that contains an object_id as the last part with an arbitrary path before it. This would allow pretty URLs for packages with less work. Just implement the URL service contract.
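
Pulling the object_id out of such a URL could look roughly like this. It is a sketch only; trailing_object_id is a hypothetical helper name.

```tcl
# Return the trailing all-digit segment of a url path, or "" if there is none.
proc trailing_object_id {path} {
    if {[regexp {/([0-9]+)/?$} $path -> object_id]} {
        return $object_id
    }
    return ""
}

puts [trailing_object_id /foo/12345]            ;# 12345
puts [trailing_object_id /some/pretty/path/42]  ;# 42
```

Everything before the final segment is ignored, which is what lets a package put an arbitrary path in front of the id.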

Posted by Timo Hentschel on
That was exactly what I was thinking of. Actually, I wanted to put it in the global /www/ dir, but we could make it a package to be mounted somewhere. I do think this should make it into the core and every package should provide a local url for its objects since we already need that for search, notification and categories.

Here's the code I was thinking of:

# Assumes object_id has already been extracted from the request url.
if {![db_0or1row get_object_data {
    select o.object_type, n.object_name, o.package_id
    from acs_objects o, acs_named_objects n
    where o.object_id = :object_id
    and n.object_id = o.object_id
}]} {
    ad_return_warning "Unable to resolve url" "No such object."
    ad_script_abort
}

set package_urls [db_list package_urls {
    select site_node.url(node_id)
    from site_nodes
    where object_id = :package_id
}]

# If there is more than one URL, we pick the first one.
set pkg_url [lindex $package_urls 0]
set impl "${object_type}_idhandler"

# Bail out if no implementation of the contract exists for this object type.
if {![acs_sc_binding_exists_p AcsObject $impl]} {
    ad_return_warning "Unable to resolve url" "No url handler for this object."
    ad_script_abort
}

# -call_args expects a list of arguments.
set object_url [acs_sc::invoke -contract AcsObject -operation Url -impl $impl -call_args [list $object_id]]

ad_returnredirect "$pkg_url$object_url"

Posted by Tom Jackson on

Display of objects is application specific. Permission procedures need to run for every object, which includes permissions on a specific subsite. This means that the url must point into the package itself, or it will require someone to design new pages for display of individual objects. For a lot of objects it doesn't make sense to display them out of context.

In some cases, I would even venture to say that the existence of an object_id should not be visible; the returned url could give away information all by itself.

Only if this functionality were mounted in an admin area would I consider it a good feature.

In short this sounds like a new requirement without a purpose. If it is optional, I don't care. Otherwise I care: I don't want my package's objects to be displayed out of context, or at least I want the choice to not display them.

If there is a function that returns the url, the function must do the same security checks as required by the package to see a link to the object. This means the function may actually need to check permissions on a parent object.

Posted by Dave Bauer on
Tom,

Because display of objects is application specific, we dispatch the display of the object to the URL specified by a Tcl proc that belongs to that package.

There may be some issue with permissions, but it is not quite what you think. The object redirect package would just redirect to a package-specific URL which could do the permissions checking. In the category package, there might be a need to limit listings of certain objects. This is also the case for the search package, which also accesses a service contract to display an object's URL.

So this object display page will not display a list of objects, or the object's contents. All it does is redirect to the appropriate URL. Nothing is displayed out of context.

Posted by Tom Jackson on

I don't think adding these new requirements to packages is useful. Why can't you create an add-on package, not in the core, which packages can use if the developer wishes?

One nice feature of using acs_objects is that the developer gets a lot out of that: permissions, global object_id. This 'feature' isn't for the developer: s/he has already decided on the interface best suited to display. This sounds very much like another feature which will just slow down a server from performing meaningful work.

I would venture to guess that the vast majority of acs_object types were never intended to be displayed on a page by themselves, out of context. Why should developers be forced to do this? The one instance I know of where an object is displayed by itself is the forums package. The message content can be displayed by itself. I can't express how useless this page is. Why is it even there?

If this is mainly an idea that will be used with the content repository, then make it a part of that package. Make the service contract requirement, if any, apply to packages that use the content repository.

Posted by Dave Bauer on
Tom,

As we have previously stated, many packages are already generating this information. We do _not_ want to display object data out of context.

Many packages implement a url service contract for search or notifications. Other packages don't support search or notifications.

I don't see the fuss over making it easier for package developers to work with objects. Most packages already have a tcl proc to link to objects within that package. All a service contract would do is allow packages such as categorization or search to present a list of objects to the user.

Posted by Dirk Gomez on
Tom, we are talking about adding *one* file. If you don't find it useful then don't use it. (However it is quite likely that other packages will pick it up if it gets accepted - which it should.)

This functionality is not about showing random objects, but it'll provide for *extremely cheap* creation of links to OpenACS-internal objects. The current approach is *very expensive*.

About your security concerns: the permissioning will still be done by the target page, nothing is shown out of context. The *tricky* thing is that the redirect must not take place if the user doesn't have appropriate permissions on the target page, because the URL may already contain sensitive information. So the redirecting page should contain the same permissioning code as the "view" page.

Posted by Tom Jackson on

What am I worried about?:

Can we go with having a service contract to figure out the local url to display an object that every package has to implement for every object type it provides?

As long as there is no requirement to implement this for every object, great.

I would think that every package instance could elect to not have its objects included in the search? If all instances of a package elected to be private, I would hope that the existence of the object types in that package would also remain private.

If not, this is just a back door to what was supposed to be a secure system.

Why, again, does this need to be in the core?

Posted by Malte Sussdorff on
Why does this need to be in the core? Because I expect a couple of packages to rely on this. Furthermore, because at least one package that will hopefully become core (categorization) relies on it. IMHO these are good arguments. But maybe we should start a discussion about guidelines for when to include a package in core in the first place.

Posted by Timo Hentschel on
Sure, if a package owner decides that his objects will not be available for search, notification and categories then it is perfectly fine not to implement the service contract and maybe even not to name the object (as in acs_named_objects - see other thread). I just assumed that most of the packages do indeed contain searchable and categorizable content, so there would be a need for the search and categories packages to display the object names and provide links to the objects. And I think that such functionality (object names and object links) should be provided by the core so that packages can rely on the interface being there. Whether it is used depends entirely on whether the site owner mounts the search or categories package.

Posted by Dirk Gomez on

Tom, the /o/$object_id is merely a replacement for the site_node.get_url function if you want to create lists of links. It defers the expensive process of calculating the URL. The assumption here is that there are fairly many pages in a web system that show fairly many links. Take portal: a regular portal page may show 5 portlets with 10 links each. Your site calculates 50 URLs although the user can click only one at a time.

On an Oracle-backed OpenACS system with 329 site nodes this is the price:


SQL> -- gee, no bind vars
select site_node.url(node_id)
          from site_nodes where object_id= 93013;

Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
         12  consistent gets


site_node.url is a recursive function that gets more expensive the more nodes you add.

In short /o (or LC_$ISOCODE/o) would supersede most calls of site_node.url thus saving overall load on the system and will not compromise security.
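
To make the cost argument concrete, here is a rough plain-Tcl sketch, not the PL/SQL site_node.url itself: the dict is an illustrative three-node tree, and node_url and deferred_link are hypothetical names.

```tcl
# Illustrative site-node tree: node_id -> {parent_id name}; "" marks the root.
set nodes [dict create \
    1 {parent_id "" name ""} \
    2 {parent_id 1 name your_division} \
    3 {parent_id 2 name file-storage}]

# Eager approach: walk up the tree once per link, so cost grows with depth.
proc node_url {nodes node_id} {
    set parent [dict get $nodes $node_id parent_id]
    if {$parent eq ""} {
        return "/"
    }
    return "[node_url $nodes $parent][dict get $nodes $node_id name]/"
}

# Deferred approach: the list page emits a constant-cost link and the
# tree walk only happens when the user actually follows it.
proc deferred_link {object_id} {
    return "/o/$object_id"
}

puts [node_url $nodes 3]      ;# /your_division/file-storage/
puts [deferred_link 93013]    ;# /o/93013
```

A page with 50 links would call node_url 50 times in the eager variant, but at most once in the deferred one.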

Posted by Jun Yamog on
Hi,

Forgive me if my post is not too well informed.  Dave pointed out the url service contract for search.  I have used this a couple of times.  Can't the new sc for this url use it, or make things more common?

I think there is common functionality between what Timo would like to do and search's url sc.

Posted by Tom Jackson on

I thought one of the principles of a dynamic website was to never display links that a user cannot follow. Another related principle would be to give a little more information on what the link might show, like the number of new posts in a link to a forum.

For a portlet, only links the user has privileges to view should be displayed, otherwise the user presented with 50 links, with a number of bad ones, might hesitate to click, fearing rejection. In other words, the links are pre-qualified.

The service contract idea for providing a url to linkable objects sounds great. I really do like it. Besides a tcl function, a wrapper in the form of a url is very handy.

My concern is with displaying information about an object in an unqualified way: just digging through the database, going around an application and showing information before security checks can be performed. This gets back to your main point: it is expensive to do security checks on a huge list of objects, and calculating the url is maybe worse or of the same magnitude. This is where application logic is important: providing a useful, secure and cheap navigation scheme for displaying objects. A portlet would presumably do the same thing, taking a user_id and providing qualified links, or object_ids and an object name or summary.

In the case of a forum portlet, probably only one permission check would be performed on the forum, and not on every thread or message.

I'm just wondering out loud here: what if the package developer could provide views? Instead of individual objects, for which a permission is checked, any whole chunk of usable data which could be tied to an object_id and permission-checked with a user_id could be queried. For instance, in a forum package, besides the links to new messages or threads, you could provide a list of who posted the last few messages, kind of a who's online. In this case the forum_id and the view name would be provided. Maybe that is what a portlet does, but my point here is that a single link to an object may end up being a little limiting. If you could specify a view, a series of urls that contain the object could be used. None of the urls would need to be specified ahead of time, just a link to the redirection url including the object_id and view. So the forum portlet, viewed as a who's online, would return a series of user names and user_ids, and the view you should use to look at the user object. Maybe in this case the view would be of the number or list of posts to this forum_id by the user.

Now would a view need to be provided by the developer? Would a view need to be a part of the package that created the data? Not necessarily, maybe it would be better if views were separate packages: unmounted services. So anyone could create a view and distribute it. When you mount a package instance, you could establish which views you want to apply to that instance.

Posted by Timo Hentschel on
Tom, I'm right with you when you're saying that you shouldn't display objects the user hasn't any right to see. So we definitely need to have (expensive) permission-queries when displaying object-lists.

I just had a look at FtsContentProvider.Url and it looks good although there are a few (minor?) drawbacks:

- every implementation has to get the site-node-url (in my proposal i thought that this could be done by the redirecting-page so that the service-contract implementation just needs to return the local url in a package to display an object)

- maybe it would be nice to be able to use the redirecting page for objects that are not stored in site-wide search because they don't contain searchable data (apm_packages, users, forums_forum and the like), so maybe we should decouple the object_id -> url translation from search

Posted by Dirk Gomez on

Tom - *nothing* in the user interface but the link's target will change. I repeat: *nothing* but the link's target.

The fact that you can follow only one link per click is a constraint by a web browser - how would you follow three links with one click?

The fundamental assumption is: If url creation is expensive - which it is a fair bit in OpenACS - then defer it till you actually really need it.

How do you come to think that a /o/ link would be shown without description?

/o/ will supersede the PL/SQL function site_node.url(node_id). Your new query will look like this: select '/o/' || object_id, ... from ... where...;

instead of select site_node.url(node_id) || '/view-object?object_id=' || object_id from ... where...;

It's a light-weight change. Nothing fancy and straight-forward!

Posted by Tom Jackson on
How do you come to think that a /o/ link would be shown without description?

Dirk,

If the description is part of an object, and the object has a read permission, then you need to run a permission check before displaying the description. This is exactly what I fear: an application that rummages around displaying private information, unchecked.

All I am saying is that you shouldn't create a back door to data, of any type, period. Now the redirection page, as I now understand it, would just take an object_id. Assuming the page didn't give away any information in case of a permission failure, then there is nothing wrong with it. I think it is a good idea. But an application that lists objects (showing the name or a description) without a permission check is just bad. It isn't what OpenACS was designed to do. The fact that it is easy to do it shouldn't be a surprise to anyone. But it is a bad idea.

Posted by Timo Hentschel on
Tom, as I already said, I'm right with you. We do have to do permission checks before displaying any object_name in any list of objects. I totally agree that there should never be a backdoor so that users can get information that they're not permitted to get.

Posted by Dirk Gomez on
Tom, I don't really see where you find that lists of objects would be shown without proper permission checking.

The redirecting page introduces one new security threat: it may redirect to a page to which you don't have permission. The URL might be descriptive enough to give you a hint. Think of getting redirected from /o/123456 to /your_division/file-storage/layoffs_q3_2003.doc. A program could just try counting up from 1 to xxx and capture the target URL. That is why - as already stated at least once in this thread - we need exactly the same permissioning code on /o/ as on the target page. If the check fails, we need to show "this object doesn't exist"; otherwise we already tell the user that the object exists.
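
A minimal sketch of that rule in plain Tcl. The two dicts are illustrative stand-ins for the object table and the permission system, and dispatch is a hypothetical name; a real page would call the package's own permission check instead.

```tcl
# Illustrative data: target urls and the users allowed to read each object.
set targets [dict create 123456 /your_division/file-storage/layoffs_q3_2003.doc]
set readers [dict create 123456 {alice}]

# Returns {redirect url} or {notfound}. A missing object and a forbidden
# object are deliberately indistinguishable to the caller.
proc dispatch {targets readers user object_id} {
    if {![dict exists $targets $object_id]
        || ![dict exists $readers $object_id]
        || $user ni [dict get $readers $object_id]} {
        return [list notfound]
    }
    return [list redirect [dict get $targets $object_id]]
}

puts [dispatch $targets $readers alice 123456]  ;# redirect /your_division/file-storage/layoffs_q3_2003.doc
puts [dispatch $targets $readers bob 123456]    ;# notfound
puts [dispatch $targets $readers bob 999]       ;# notfound
```

Because both failure cases return the same answer, counting up object_ids reveals neither target urls nor which ids exist.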

Otherwise security threats remain the same and if the respective target page is insecure, then the redirecting page is insecure.

Currently, if the source page lists links to which you don't have permissions, then it leaks information. If it does proper permission checking, then it'll remain secure with /o/. As I said: the only thing being changed is the link's target.

The "stable URLs" proposal is not about listing objects - it is about supplying stable URLs based on the (hopefully) never-changing object_id. A very very welcome side effect is that it is a very very lightweight replacement for get_urls PL/SQL functions.

Posted by Dave Bauer on
See this thread: https://openacs.org/forums/message-view?message_id=144560 on merging the concept of cr_folders and site_nodes.

We could abstract the folders further to allow them to contain any object.

By assigning one folder to each site_node and naming objects that should have URLs (and falling back to object_id as the name if it's not named), any URL can be calculated efficiently on a display page.

The URL of each site_node is cached in an NSV array, so appending the name or object_id to the site_node would be efficient.

I think this is a good area to explore for future versions of OpenACS (post 5.0).