Forum OpenACS Development: Re: The future of OpenACS

Posted by Stefan Sobernig on
Malte,
WEBSERVICES
There is certainly a need for a co-ordinated initiative along these lines, but I do fear that web services alone are not the killer app OpenACS is looking for. At the last OpenACS conference we all sat together and shared our requirement sets, which told a kind of fairy tale of what is needed, and those requirements do not boil down to web services as such. I will try to sum up some of the key issues discussed back then in April further down, in appreciation of your list of issues and to foster further discussion.
I am aware that I will get lots of bashing about this for not understanding the concepts of web services
At least not from my side ;) The scenarios you sketched in some one-liners are certainly valid; however, they might not be directly related to web services. They could be solved (better, more elegantly, ...) by alternative approaches to the same idea of remoting, or may not even require middleware as such. But I think this is the wrong place to go into further detail. I'd like to continue your requirements gathering by listing and commenting on some requirements we came up with back in April. Let's rephrase "web services" to something more general, such as "remoting"; I hope you agree.

interaction styles

While this is a quite generic term, it subsumes all that has been discussed under buzzwords such as RPC, document or message style, asynchronous remoting and, more recently, eventing. These discussions usually boil down to flame wars over terminology: what should we support, and why, and are they in fact the very same thing? Back at our get-together in April, we collected 4-5 scenarios, mainly from you, Malte, Frank, Nima and Neophytos, if I remember correctly. In view of interaction styles, I got the impression that there is an articulated need to support "data exchange & integration scenarios" rather than conventional component integration (or integration of remote functionality). This would shift the requirement set in this category. It simply is an area of important strategic decision making.

single vs. multi-protocol support

Why not provide/expose functionality and data over different protocol channels (SOAP and XML-RPC) without multiplying the coding effort (e.g., rewriting the serving bits of code and the interface descriptions)? In fact, multi-protocol support is something rather new in framework design; most established frameworks are rather short-handed in this respect. Note, by multi-protocol I do not simply mean providing an xml-rpc and a soap package and having the individual developer deal with exposing their code to them. There should be a certain protocol transparency from the developer's perspective. I don't want to give examples from the literature, but just a hands-on example: Ruby on Rails comes with somewhat multi-protocol support (at a rather low level, for my taste) in its Action Web Service module. By switching around some parameters, you either go for an XML-RPC or a SOAP call/provision, through a common interface.
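
To make the protocol-transparency point more tangible, here is a minimal sketch of the developer-facing surface I have in mind. Everything in it is hypothetical: there is no ::remoting package in OpenACS, and the helper names are made up purely for illustration.

    # Hypothetical sketch: register an operation once, serve it over several
    # wire protocols. None of these commands exist in OpenACS today.
    namespace eval ::remoting {}

    proc ::remoting::expose {name opts body} {
        # Store the interface description once; protocol bindings derive from it.
        array set o $opts
        set ::remoting::registry($name) [list $o(-input) $o(-output) $body]
    }

    proc ::remoting::dispatch {protocol name args} {
        # A SOAP or XML-RPC front end would deserialize the request, then call this.
        lassign $::remoting::registry($name) input output body
        apply [list $input $body] {*}$args
    }

    # The developer writes the operation exactly once ...
    ::remoting::expose add {-input {a b} -output sum} {
        expr {$a + $b}
    }

    # ... and both channels reuse it; only the (de)serialization would differ.
    puts [::remoting::dispatch soap    add 2 3]   ;# -> 5
    puts [::remoting::dispatch xml-rpc add 2 3]   ;# -> 5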

developer support

This is an area in which to gain ground (at least with respect to other web development environments).
  • "scaffolding environments": This is the place where your "ad_form" scenario goes, but this might be easily amended by considering such a form-support for xowiki.
  • as simple as is, a nice (xslt-based) annotating wsdl/wadl reader for developers (this has been pioneered before, can't find the link now)
  • code generation from wsdl/wadl (or better dynamic stub creation)
  • testing environments, i.e. integration with an automated test environment to allow for testbed runs of remoting actions.
  • some (web-based) logging and monitoring functionality, something even lacking in most commercial platforms.
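
On the testing bullet: OpenACS already ships an automated-testing API (aa_register_case and friends), so a remoting testbed could plug straight into it. The case below is only a sketch; my_remote::call stands in for whatever invocation API the remoting layer would actually provide.

    # Sketch of a testbed case; my_remote::call is a hypothetical stand-in.
    aa_register_case -cats {api smoke} remoting_echo_roundtrip {
        Round-trip a simple value through the remoting layer and compare.
    } {
        set reply [my_remote::call -protocol soap -operation echo -args {hello}]
        aa_equals "echo comes back unchanged" $reply "hello"
    }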

inter-framework compatibility

The nasty thing about remoting (especially following the illusion created by standards) is that it only pretends to make applications interoperate. The list of incompatibilities is never-ending; has anyone out there succeeded in interacting with .NET Remoting from whatever environment on the very first run? This is (partly) beyond our influence, but it might be a good selling point to come up with a solution addressing these issues, and a couple of them can be resolved smoothly or at least properly documented. There are a couple of directions at hand.

security & remoting

Most would summarize this requirement by calling for support of WS-Security (and related specifications) as propagated by OASIS and the WS-I industry groups. However, most problems (apart from implementing these rather complex security schemes) occur at lower levels (what about SSL/TLS?) or on top of WS-Security. Just to name one: how do we design and realise configurable access to functionality and data offered over remoting channels? Some would now say, okay, let's go for WS-Policy, but that is another major task. Rather straightforward but powerful "invocation access policies" (note, by access I do not mean "access control") date back to the time of early CORBA. A basic scheme can be implemented quite straightforwardly, similar to xowiki's policies (xorb actually features such policies); a rough sketch follows after the summary below. To summarize my main messages:
  • "Web services" is not enough. This is both a frustrating and encouraging statement. Frustrating, because it means considerable efforts to provide some basic support. Encouraging, because we do not need (and should not) reproduce an existing scheme, instead look for a niche and go for it.
  • I highlighted some possible niches above, note, remoting support for web environments means something different than writing yet another AXIS, gSOAP, .NET Remoting or the like. It needs different focus (developer support, ...)
  • It can't be done as one man's show ;) and needs some backing (funding, dedicated dev time, ...). But this goes without saying, actually.
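
And here the promised sketch of a minimal "invocation access policy" table. The names and the rule format are entirely made up (this is neither xowiki nor xorb syntax); it only shows how small such a first-match scheme can be.

    # Hypothetical policy table. Each rule: operation-pattern channel-pattern verdict.
    namespace eval ::remoting {}

    set ::remoting::policy {
        {news.*      soap     allow}
        {forum.post  xml-rpc  deny}
        {*           *        deny}
    }

    proc ::remoting::allowed_p {operation channel} {
        foreach rule $::remoting::policy {
            lassign $rule op_pat chan_pat verdict
            if {[string match $op_pat $operation] && [string match $chan_pat $channel]} {
                # First matching rule wins; conditions (if-registered, ...) are left out.
                return [expr {$verdict eq "allow"}]
            }
        }
        return 0
    }

    # ::remoting::allowed_p news.latest soap      -> 1
    # ::remoting::allowed_p forum.post  xml-rpc   -> 0
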
Posted by Tom Jackson on
Why not provide/expose functionality and data over different protocol channels (SOAP and XML-RPC) without multiplying the coding effort (e.g., rewriting the serving bits of code and the interface descriptions)? In fact, multi-protocol support is something rather new in framework design; most established frameworks are rather short-handed in this respect. Note, by multi-protocol I do not simply mean providing an xml-rpc and a soap package and having the individual developer deal with exposing their code to them. There should be a certain protocol transparency from the developer's perspective. I don't want to give examples from the literature, but just a hands-on example: Ruby on Rails comes with somewhat multi-protocol support (at a rather low level, for my taste) in its Action Web Service module. By switching around some parameters, you either go for an XML-RPC or a SOAP call/provision, through a common interface.

Everything should be automatic! This is like ASP.NET: just add a few decorations to your code and it becomes a web service... or not. Actually, if you want to have complex types or validate the input, you are totally screwed; now you need to write your own code for doing this. Apache AXIS is worse. It _is_ true that once the bits get to the protocol level the developer doesn't need to know anything, but by that time who cares? There are no switches to throw anywhere in any system, and you still lack the most important ingredient: a published description of the service and how to access it. Maybe somebody will recognize the importance of following a standard so that external clients can actually use your interface.

Now, what are the issues with OpenACS? The main issue is that most OpenACS applications depend upon page views. By that I mean that pages serve as procedures. This makes it nearly impossible to reuse this code from a different protocol. For instance, my query-writer package requires all form submissions to go through a special page, very similar to rpc, minus the xml. Until I write a procedure, or a set of procedures, which covers this functionality, no amount of developer support will help me.
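
A small illustration of what "pages serve as procedures" means in practice. The page below (all names invented) validates and processes the submission inline via ad_page_contract, so a SOAP or XML-RPC channel cannot reuse any of it; only once the same logic lives in a procedure does it become callable from elsewhere.

    # www/item-add.tcl -- the logic is trapped inside the page request.
    ad_page_contract {
        Adds an item. Only reachable through an HTTP form submission.
    } {
        title:notnull
        quantity:integer,notnull
    }
    # ... insert/update work happens inline here ...

    # The reusable alternative (in a -procs.tcl library file): the same work
    # packaged as a procedure, which a page *and* a remoting channel could call.
    namespace eval demo {}

    ad_proc -public demo::item_add {
        -title:required
        -quantity:required
    } {
        Adds an item; callable from a page, a test case, or a web service.
    } {
        # ... same insert/update work, no request-processor dependency ...
    }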

Another issue specific to Tcl is important to recognize: Tcl does not maintain type information, ad_proc actually complicates the situation, and we know nothing about certain things like return types, use of uplevel/upvar, etc. Many programming languages know all this information, so it is 'easy' to serialize a data structure to XML. This is why there are so many rpc packages in every language. But rpc has many problems which were never solved, because they are much harder. The first problem is that server and client really need to use the same language for full interoperability; otherwise, one or the other needs to learn to handle conversions. The second is that it is unlikely that anyone anymore really wants to expose an internal procedure to the network. The best method of handling this is to write a wrapper procedure to restrict use or perform any additional checks that usually take place prior to execution. One of these 'checks' is validation. This is where the wheels fall off of rpc, and it is why nobody, not .NET or AXIS, performs validation out of the box! And by validation, in this situation, I only mean simple validation, like: is 4 an int? Derived types and complex-type validation aren't even supported if you wished to do it. Just make up your own method of figuring out whether the input is valid, test, debug and hope.
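
To make the wrapper-plus-simple-validation point concrete, a plain-Tcl sketch (all names invented): the internal procedure is never exposed, the wrapper is, and the only validation it attempts is of the "is this an int?" kind described above.

    namespace eval order {}

    # Internal procedure: never exposed to the network directly.
    proc order::total {order_id} {
        # stand-in for the real lookup
        return [expr {$order_id * 10}]
    }

    # Wrapper intended for the remoting layer: restricts use and performs the
    # simple validation discussed above before the internal proc ever runs.
    proc order::total_remote {order_id} {
        if {![string is integer -strict $order_id] || $order_id < 1} {
            error "order_id must be a positive integer"
        }
        return [order::total $order_id]
    }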

So if the big boys are not offering this as a feature of their products, with billions spent on development, maybe it isn't so easy. Configuring a procedure as a remote application, or whatever it is called, will always require a certain amount of information. How to collect, store and expose this information so it can be reused by new protocols, maybe at the same time... that would be helping the developer.

Hmmm, maybe look at tWSDL. tWSDL implements a single protocol: SOAP with document/literal. What is probably not appreciated is that the protocol itself is just a few pages of code, and it is chosen based upon the client request. If someone were to write their own 'binding' and protocol handler, they could reuse the entire tWSDL code base. The service would be fully described by the WSDL, and input messages would be validated prior to use. If XML were not used as the representation, the protocol handler would use some method other than tDOM to deserialize, and some other method to serialize the result. Writing the code is the least time-consuming part of this. Deciding what your protocol is and how it works will take the most time.

One thing has become clear over the last few years: developers are starting to recognize that interoperability is the key. Interoperability is achieved by restricting the number of options and requiring public descriptions of services which don't require out-of-channel communication.

A note about code generation from WSDL: TWiST does this, not just stub creation like AJAX, but it is only for the server. I have begun work on a TWiST client. The client can run alongside the server, but it will very likely not require AOLserver, that is, it will be pure Tcl. I'm developing it against Google's AdSense service. Everything is https, so I'm using ns_https to pull in the WSDL to read. The service was developed on AXIS and uses a feature not in tWSDL/TWiST: SOAP headers. Anyway, the goal is a client which is created from reading the WSDL, with an API created for each operation. Creating a document and sending it should work just like in the TWiST server. The return document will be available, but there will likely be a series of generic procedures for turning the document (as a Tcl representation) into Tcl structures, lists, arrays, etc. Of course there will be optional or automatic validation of the return document.
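
For the pure-Tcl direction: one way such a client could pull in a WSDL over https without AOLserver is the stock http package plus the tls extension. This is a generic sketch with a placeholder URL, not TWiST client code, and it sidesteps ns_https entirely.

    package require http
    package require tls

    # Route https:// URLs through the tls socket command.
    ::http::register https 443 ::tls::socket

    # Placeholder URL; the real target would be the service's WSDL endpoint.
    set url "https://example.org/service?wsdl"
    set tok [::http::geturl $url -timeout 30000]
    if {[::http::status $tok] ne "ok"} {
        error "WSDL fetch failed: [::http::status $tok]"
    }
    set wsdl [::http::data $tok]
    ::http::cleanup $tok
    # $wsdl now holds the raw WSDL text, ready to hand to a parser such as tDOM.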

Posted by Malte Sussdorff on
The main issue is that most OpenACS applications depend upon page views. By that I mean that pages serve as procedures. This makes it nearly impossible to reuse this code from a different protocol

I was wondering if this "problem" could not be solved by limiting the web service to ad_form and the list builder, where pages that contain them can interact with them through the request processor. At least for ad_form this should be fairly easy to do, but maybe I am just having a good dream 😊.

Additionally, I am wondering: if Tcl does not maintain type information, why don't we just expose everything to the remoting functionality as a string? I understand all the benefits of having things typed, and I think we could probably take your type checking from TWiST and make it generally applicable to ad_proc (just thinking out loud, not looking into the code), but my question would still be: is this such an issue for OpenACS?
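
Just to make the string-only idea concrete: from Tcl's point of view every ad_proc parameter is already a string, so a string-only remoting layer could expose it untouched and leave interpretation to the caller. The ad_proc below is ordinary OpenACS code with invented names; the commented-out publishing call is purely hypothetical.

    # Ordinary ad_proc; both parameters are plain strings as far as Tcl knows.
    namespace eval demo {}

    ad_proc -public demo::concat_names {
        -first:required
        -last:required
    } {
        Concatenate two name parts; everything arrives and leaves as a string.
    } {
        return "$first $last"
    }

    # Hypothetical one-liner to publish it with every argument mapped to xsd:string:
    # remoting::expose demo::concat_names -types {first string last string}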

I look forward to your client, and if you need a dummy to test it against anything other than Google AdSense, just make the code available, as we are (at the moment) looking into TSOAP and manual code generation for calling external WSDL-compliant services.

Posted by Tom Jackson on
With TWiST, you need a Tcl procedure to serve as the operation. An input message contains the arguments and the output message, obviously, the result. The only 'problem' is picking a procedure which exposes what you want. If the code which needs to execute is not packaged up this way, then the 'problem' is to write it, similar to what I would need to do for my qw.tcl file which processes all form submissions. Obviously my job is much simpler because I only have one page to deal with. On the other hand, I have the same issues for the applications which use query-writer: forms/pages which select, organize and display information. One advantage of query-writer is that it contains a lot of data model information, so it could be used to auto-generate TWiST configuration files and create a simple select (one or many) Tcl API, and also to use the qw_* API for insert/update/delete operations. (Since query-writer gets type information directly from the database, typing can be automated.)

Working with a procedure defined via ad_proc actually presents very similar issues to an application page. ad_proc serves many purposes, but the way it works makes it harder to expose as a web service. This is why TWiST works the way it does: configuration is done via <ws>proc, which provides a wrapper for the internal API. Inside this wrapper you can specify any amount of code. Maybe you could source a tcl page and do something with the resultant state. Whatever goes on inside the wrapper, you just need to return a list which represents the output message.
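
A plain-Tcl illustration of the shape of such a wrapper (deliberately not the actual <ws>proc syntax, and with invented names): arbitrary code inside, and at the end a list that represents the output message.

    # Illustrative wrapper body; not <ws>proc syntax.
    proc lookup_user_wrapper {user_id} {
        # Any amount of code can go here: permission checks, sourcing a page,
        # calling internal OpenACS API, massaging the resultant state ...
        if {![string is integer -strict $user_id]} {
            error "user_id must be an integer"
        }
        set name  "Example User $user_id"           ;# stand-in for the real lookup
        set email "user${user_id}@example.org"

        # The only hard requirement: return a list representing the output
        # message, which the framework then formats into the document.
        return [list name $name email $email]
    }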

One point about typing with TWiST: more important is the structural typing, such as the names and order of the complexType elements, how many of each element there are, whether an element can be missing, whether there is a default if it is missing, whether a missing element indicates 'nil', etc. Otherwise your return document will just be a smashed-together mess; internally, TWiST does not care about types, just structure. Essentially the type information is a description of the template: what the document is going to look like. Nobody would ever suggest we just run a tcl script and return a list of all the variables; the results need to be formatted, and this is what the type specification does: it defines the non-data parts of the document. The wrapper code doesn't need to know anything about XML or any of this configuration; it just needs to return a list which will be formatted into a document.
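
Purely as an illustration of "the list becomes the document" (element names and the exact XML shape are invented, not TWiST's actual output):

    # The wrapper returns only data, as a flat list of part names and values ...
    set result [list sum 7 remainder 1]

    # ... and the structural/type specification supplies everything else.
    # A document/literal response built from it might look roughly like:
    #
    #   <DivideResponse>
    #     <sum>7</sum>
    #     <remainder>1</remainder>
    #   </DivideResponse>
    #
    # The wrapper code never sees or produces that XML itself.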

When the TWiST client is done it will only handle document/literal; whatever is available now is in the wsclient package of tWSDL. I think everything is in a web page, and it requires the ssl module to be installed, at least to use ns_https.