Forum OpenACS Development: NaviServer 4.99: Issues with "rename"?

Hi,

We've got an issue in ]po[ (Windows) with "rename". When calling "exec" from /ds/shell or any other part of the system, it says 'invalid command name "::exec_orig"'. That's funny, because calling "exec" directly after the definition works (see below). Any ideas?

Thanks a lot!
Frank

---

rename ::exec ::exec_orig

proc exec {args} {
# Processing program name
set procname [lindex $args 0]
set args [lrange $args 1 end]
...
set cmd "::exec_orig $procname $args"
return [eval $cmd]
}

# This works here!
exec "/bin/find"

Posted by Gustaf Neumann on
Maybe you are running into a conflict with the exec definition of OpenACS [1]. Probably, ::exec_orig is not part of the blueprint. I strongly recommend not messing around with exec, especially not in the way you are using it (it is unsafe from Tcl: variable globbing plus the dangers of "eval"). Why are you not leaving exec alone?

[1] https://openacs.org/api-doc/procs-file-view?path=packages%2facs-tcl%2ftcl%2fproxy-procs.tcl&version_id=4877495&source_p=1
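
To make the "eval" concern concrete, here is a minimal sketch in plain Tcl. The helper show_words is a hypothetical stand-in for the real exec, so nothing is actually executed; it only shows how the words get split. The wrapper names wrap_eval and wrap_expand are also made up for illustration, and the sketch assumes Tcl 8.5 or newer for the {*} expansion syntax.

```tcl
# Hypothetical stand-in for exec: returns the words it received.
proc show_words {args} { return $args }

# Variant 1: build a command string and eval it (the pattern from the post).
proc wrap_eval {args} {
    set procname [lindex $args 0]
    set args [lrange $args 1 end]
    set cmd "show_words $procname $args"
    return [eval $cmd]
}

# Variant 2: pass the words through unchanged with {*}.
proc wrap_expand {args} {
    return [show_words {*}$args]
}

# A program path containing a space, as is common on Windows.
set r1 [wrap_eval   "C:/Program Files/dot" -V]
set r2 [wrap_expand "C:/Program Files/dot" -V]
puts "eval:   [llength $r1] words"   ;# prints: eval:   3 words
puts "expand: [llength $r2] words"   ;# prints: expand: 2 words
```

With the string-based variant the program name is re-parsed and falls apart at the space; the same re-parsing would also evaluate any brackets or dollar signs in the name. The expansion variant hands each word over exactly as it was received.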

Posted by Frank Bergmann on
Hi Gustaf,

]po[ runs on both Linux and Windows (and other stuff if necessary). As part of ]po[ we use a lot (>10) of Linux commands like find, ifconfig, dot, ...

The easiest way to keep the Windows and Linux versions in sync is to use exactly the same commands on the TCL side, and have a custom version of "exec" that performs some path corrections etc. This way the TCL part is exactly identical. This is very, very important for testing etc.

We would have to touch all "exec" calls in the entire system and replace them with "im_exec" if we don't manage to get a custom version of exec going.

I'll check the link below to check if we can work around the issues with OpenACS exec.

Thanks!
Frank

Posted by Gustaf Neumann on
The exec implementation in OpenACS is built around nsproxy (in cases where nsproxy is available). The reasons for using nsproxy are reliability (Tcl's exec has several problems with concurrency, which might lead to hangs and crashes, hopefully solved in 8.5.19) and resource consumption (every [exec] calls fork, which requires "copying"/preallocating the full vmsize; in combination with zippy malloc this easily leads to out-of-memory problems when OpenACS is large and memory is scarce, e.g. 4GB). These exec problems are issues in Unix versions; therefore, using nsproxy is highly recommended on Unix systems to avoid problems. If PO replaces [exec] with its own breed, it reintroduces the mentioned problems, leading to limited reliability.

What do you mean by "path corrections"? Paths pointing to the correct binaries? Are you aware of util::which [1]? If PO uses only 10 external programs, the right approach is to locate these during startup with util::which, store the full paths to the binaries in variables (best in the blueprint), and use these variables in exec without runtime overhead.

-g
[1] https://openacs.org/api-doc/proc-view?proc=util%3a%3awhich&source_p=1
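
The "locate once at startup" idea can be sketched in plain Tcl as follows. This is only an illustration under assumptions: it uses Tcl's standard auto_execok as a stand-in for util::which (which needs OpenACS), and the helper name locate_binaries, the global ::binaries and the program list are hypothetical.

```tcl
# Hypothetical startup helper: resolve each external program once and
# remember the result. auto_execok returns an empty string when the
# program cannot be found on the PATH.
proc locate_binaries {names} {
    set found [dict create]
    foreach name $names {
        set path [auto_execok $name]
        if {$path ne ""} {
            dict set found $name $path
        }
    }
    return $found
}

# Resolve the external programs once (the list is an example only).
set ::binaries [locate_binaries {find dot ifconfig}]

foreach name {find dot ifconfig} {
    if {[dict exists $::binaries $name]} {
        puts "$name -> [dict get $::binaries $name]"
    } else {
        puts "$name -> not found"
    }
}

# Later calls can use the stored value without searching again, e.g.:
#   exec {*}[dict get $::binaries find] /tmp -name "*.log"
```

The {*} in the commented call is there because auto_execok returns a list, which on some platforms can contain more than one word.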

Posted by Frank Bergmann on
Hi!

Gustaf: "right approach"

The priority is to get ]po[ V5.0 out and not to spend any more time on infrastructure. The "exec" method has never caused any problems in the past, so there ain't nothin' to fix if there ain't no problem (not sure about the spelling...)

It seems I spent too much time in Spain and the US, so that I doubt any "right approach" concept, unless there is a business case for it 😊

Frank

Posted by Maurizio Martignano on
Dear Frank and Gustaf,
let me add a few points to this discussion.
When calling an external program/utility we basically need two things:
1. know where the program is (so that we can call it)
2. pass it the proper parameters (in case they are needed).
Up to now the discussion has covered only point 1.
But suppose our program/utility requires a file name as a parameter.
In this case file naming conventions are also important "inside" the command line parameters.
Things become even trickier when we use mixed models of file naming conventions. I'll try to explain: in Linux all executable binaries follow the same file naming conventions. In pure Windows, when all the executable binaries are native Windows binaries, they also all follow the same file naming conventions.
By contrast, when some of the binaries are Windows native and others are Cygwin (or MinGW) binaries, they are not guaranteed to follow the same file naming conventions.

Hope it helps,
Maurizio

Posted by Maurizio Martignano on
I'm adding an example to explain my point.
Let's take the following simple C Program:

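#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define LEN 2048

/* Global buffer, zero-initialized, so it starts out as an empty string. */
char command[LEN+1];

int main(int argc, char *argv[]) {
    int i;
    /* Join all arguments into one command line, each followed by a space. */
    for (i = 1; i < argc; i++) {
        /* Refuse to overflow the buffer (argument plus one space). */
        if (strlen(command) + strlen(argv[i]) + 1 > LEN) {
            fprintf(stderr, "Command line too long\n");
            return 1;
        }
        strcat(command, argv[i]);
        strcat(command, " ");
    }
    printf("Executing command = %s...\n", command);
    system(command);
    printf("Done!\n");
    return 0;
}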

Let's compile it both as Native Windows (exec_win) and as Cygwin-64 (exec_cyg) and let's run it from within Cygwin-64...

Maurizio@MMNASUS /cygdrive/c/smart/prog/provec
$ exec_win which ls
/usr/bin/ls
Executing command = which ls ...
Done!

Maurizio@MMNASUS /cygdrive/c/smart/prog/provec
$ exec_cyg which ls
Executing command = which ls ...
/usr/bin/ls
Done!

Maurizio@MMNASUS /cygdrive/c/smart/prog/provec
$ exec_win ls c:\\tmp
Executing command = ls c:\tmp ...
Done!

Maurizio@MMNASUS /cygdrive/c/smart/prog/provec
$ exec_cyg ls c:\\tmp
Executing command = ls c:\tmp ...
ls: cannot access 'c:tmp': No such file or directory
Done!
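
In the last run the backslash is lost before ls sees the path, presumably because the Cygwin shell treats it as an escape character. One way to avoid this on the Tcl side is to convert Windows-style paths to forward slashes before they reach the command line. A minimal sketch, where to_forward_slashes is a hypothetical helper name:

```tcl
# Hypothetical helper: replace every backslash in a path with a forward slash.
proc to_forward_slashes {path} {
    return [string map [list \\ /] $path]
}

puts [to_forward_slashes {c:\tmp}]          ;# prints c:/tmp
puts [to_forward_slashes {c:\smart\prog}]   ;# prints c:/smart/prog
```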

Posted by Gustaf Neumann on
I have spent a lot of time handling exec issues of Tcl in AOLserver and NaviServer, which usually only happen under heavy usage, when the server has long uptimes, etc. I am just pointing out consequences of design decisions, which the person rewriting "exec" for application-specific needs might not have been aware of.

The "right approach" is the one that reduces maintenance cost and reliability complaints in the long run....

Posted by Frank Bergmann on
Hi!

Gustaf: "reduces ... maintenance costs"

That's a perfect business case 😊 Well, the point in ]po[ is that most "execs" are used in administrator functions like periodic integration with external systems etc. To my knowledge there is no exec in operational procedures, except for the Graphviz "dot" calls in the acs-workflow package. This may be the reason why we haven't had any issues under high load etc. so far.

Concerning Maurizio's post:
I'm aware that there may be incompatibilities in the arguments of an exec command, and not only in the path to the binary. However, all CygWin binaries use the Linux "/" forward slash notation that is also accepted in TCL. That is why we're on pretty safe ground if we keep everything to the Linux standard.

proxy-procs.tcl

I haven't tested yet (easter holidays...), but I'm quite confident that I can fix our current issues with the knowledge about this new proxy::exec. I understand the arguments given by Gustaf in favor, and I believe this is great work. It might even be worth the pain to change the respective ]po[ code. I'll check and may come back here on the forum.

Thanks a lot for the great work!
Frank

Posted by Frank Bergmann on
Hi Gustaf,

I'm back at the issues of "renaming exec" on Windows. I believe I've got a simple issue with TCL namespaces. This has nothing to do with nsproxy!

I'm frankly not motivated enough to re-write all this crud, I just want it to work like it was before.

And that's very simple: Commands are plain Linux commands ("/bin/pwd"). ]po[ relies on a complete CygWin installation, so that all commands are "there". Now we just need to add a "C:/project-open" in front of any Linux command in order for exec to work.

I've got the following code (simplified):

rename ::exec ::exec_orig
proc exec {args} { return [eval "::exec_orig C:/project-open$args"] }
ns_log Notice "exec: pwd=[exec "/bin/pwd"]"

This code works during NaviServer startup on Windows. However, it doesn't work when executing "exec /bin/pwd" from an OpenACS shell; it fails with the error message "invalid command name "::exec_orig"". exec_orig somehow disappeared from [info commands].

Any idea?

Thanks!
Frank

Posted by Gustaf Neumann on
a) Your code should use [pwd] and not [exec /bin/pwd]. This is faster and more portable.
b) If you want to find commands in nonstandard places, why not use the PATH?
c) NaviServer makes no attempt to preserve "renames" in the blueprint, since renames are destructive, and it is practically impossible to determine the right place to rename a command during loading of the blueprint.
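
Point c) can be illustrated with a rough analogy in plain Tcl. This is not NaviServer's actual blueprint mechanism, only a sketch under the assumption that a later interpreter receives the proc definitions but not the rename. The interpreter names startup and worker are made up for the illustration.

```tcl
# "Startup" interpreter: rename exec and install a wrapper, as in the post.
set startup [interp create]
interp eval $startup {
    rename ::exec ::exec_orig
    proc exec {args} { return [::exec_orig {*}$args] }
}

# "Worker" interpreter: gets a copy of the proc definition only.
# The rename itself is never replayed here.
set worker [interp create]
set body [interp eval $startup {info body exec}]
interp eval $worker [list proc exec {args} $body]

puts "startup: [interp eval $startup {info commands ::exec_orig}]"
puts "worker:  [interp eval $worker {info commands ::exec_orig}]"

# Calling the wrapper in the worker fails, since ::exec_orig is missing there.
set failed [catch {interp eval $worker {exec echo hi}} msg]
puts "worker exec failed: $failed ($msg)"
```

In the worker the copied proc has replaced the built-in exec, and the command it delegates to does not exist, which matches the symptom reported above.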
Posted by Maurizio Martignano on
Dear Gustaf and Frank,
I will try to explain the current situation for the Windows port (at least my installer).

1. the renaming was implemented in an old file "acs-tcl\tcl\windows-procs.tcl".

2. At least in my distribution that file is disabled, by calling it "windows-procs.tcl.org".

3. So my distribution uses the standard "exec" (when the nsproxy module is not loaded, that is
ns_section ns/server/${server}/modules
ns_param nssock ${bindir}/nssock.dll
# ns_param nsproxy ${bindir}/nsproxy.dll
)

4. with this configuration, running for instance the command:

exec psql -U postgres -l

works and produces the following output:

                                                   List of databases
     Name     |  Owner   | Encoding |           Collate           |            Ctype            |   Access privileges
--------------+----------+----------+-----------------------------+-----------------------------+-----------------------
 kedrios      | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 machines     | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 machines_new | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 postgres     | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 projop       | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 sonar        | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 template0    | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres          +
              |          |          |                             |                             | postgres=CTc/postgres
 template1    | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres          +
              |          |          |                             |                             | postgres=CTc/postgres
(8 rows)

5. the nsproxy module only recently (yesterday) became compilable under Windows, and it is available in my distribution as of today. With that module loaded, that is:
ns_section ns/server/${server}/modules
ns_param nssock ${bindir}/nssock.dll
ns_param nsproxy ${bindir}/nsproxy.dll
the same psql command given in the ds shell produces the following output:
ERROR:
exec failed: no such file or directory
while executing
"ns_proxy get ExecPool"
(procedure "::nsf::procs::proxy::exec" line 3)
invoked from within
"proxy::exec -call $args"

In fact also nsproxy renames the original exec.

Hope it helps,
Maurizio

Posted by Gustaf Neumann on
Dear Maurizio,

Yesterday I invested time to write the additional code to make the nsproxy module compile under Windows. Support for nsproxy on Windows has been requested several times in the past. However, as I wrote in my mail to you, I only implemented the missing functions under Windows and did not dig into the nmake logic to produce the binaries, so I asked you to test it....

The error message "no such file or directory" might indicate, that there is no nsproxy binary in the naviserver bin directory. Under windows, there should be a nsproxy.exe and a nsproxy.dll. Do you have these?

The "exec" definition in OpenACS does not rename "exec", but it overloads it. That makes it problematic, if one tries to rename it in addition.

Posted by Maurizio Martignano on
Dear Gustaf,
I believe we are talking in circles.
You wrote:
"Under windows, there should be a nsproxy.exe and a nsproxy.dll. Do you have these?".
The only important outputs of the Windows build process are nsd.exe and all modules *.dll files.
Executables like nssock.exe, nsproxy.exe, etc. are the intermediate build products created/used by the legacy nmake "makefiles".
And of course I had the nsproxy.dll in place. When you specify in the config file that you need to load a module (a *.dll) if it doesn't find it, it complains.
You also wrote
"The "exec" definition in OpenACS does not rename "exec", but it overloads it. That makes it problematic, if one tries to rename it in addition".
What I can see in the files is that both windows-procs.tcl (line 22) and proxy-procs.tcl (line 40) call a rename.
And what I can say is the following:
1. in my configuration/installer I do not load the windows-procs.tcl file, so that renaming does not take place.
2. when I use the system without loading nsproxy.dll the exec command works and looks for binaries in the specified PATH.
3. when I use the system with nsproxy.dll loaded the exec command does not work (and a renaming takes place).
4. Frank does not need nsproxy, I mean in the context of this discussion. He only needs a "new" exec to have control on where the system looks for binaries. So he can probably take the "new" exec in the windows-procs.tcl and modify it as he requires/pleases.

Hope it helps,
Maurizio

Posted by Gustaf Neumann on
Maurizio: The only important outputs of the Windows build process are nsd.exe and all modules *.dll files.

If this is the case, the windows build is incomplete. In an installation including nsproxy, there should be three binaries in the ns/bin directory:
- ns/bin/nsd.exe
- ns/bin/nsproxy.exe
- ns/bin/nsthreadtest.exe
On unixoid platforms the filenames do not contain the .exe suffix. The last one is for testing.

It should not be complicated to produce an nsproxy executable under windows. As a reference, the nsproxy executable program is linked as follows:

gcc -L../nsthread -L../nsd -L../nsdb -o nsproxy  nsproxy.o libnsproxy.dylib -lnsthread -lnsd -L/usr/local/ns/lib -ltcl8.6     -prebind -headerpad_max_install_names -Wl,-search_paths_first  -L/usr/local/ns/lib -L/opt/local/lib -lssl -lcrypto

The nmake-file for windows should be extended to contain a similar build rule.

Concerning the "renaming" issues: Where is windows-procs.tcl coming from? This is not part of OpenACS. I was talking about OpenACS, where "exec" is redefined in acs-tcl/tcl/proxy-procs.tcl

There are certainly many ways to address the "exec" problem in Frank's case. My recommendation is to stay away from renaming and to aim for a solution that is as close as possible to the maintained part of OpenACS. The advantages of nsproxy (probably incomplete):
- when nsproxy workers are started early, no memory bloat on execs (important, when running on machines with little memory)
- address the race-conditions, when a tcl script changes the working directory and issues a command in this directory
- allows to specify environment variables for the exec environment that can be different to those of the main process
- allows to set timeouts and many more to improve resilience

But maybe, it is not worth the effort to get nsproxy running under windows.

Posted by Frank Bergmann on
Hi!

Short update: We are in the process of replacing all exec occurrences in ]po[ with "im_exec", which will include the Windows-specific logic. So there is no "rename" anymore.

So I have moved the content from acs-tcl/tcl/windows-procs.tcl into this im_exec routine.

im_exec in turn calls exec, after some transformations of the arguments into a format that works with CygWin.

Cheers,
Frank
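
The actual im_exec is not shown in this thread. The following is only a sketch of what such a wrapper could look like in plain Tcl, assuming the "C:/project-open" prefix mentioned earlier and assuming that only absolute Unix-style program paths need it. The helper im_exec_args is a hypothetical name; it takes the platform as a parameter so that the transformation can be tried out on any system.

```tcl
# Hypothetical argument transformation for a Windows/CygWin installation.
proc im_exec_args {platform prefix argv} {
    if {$platform ne "windows"} { return $argv }
    set prog [lindex $argv 0]
    # Only absolute Unix-style program paths get the installation prefix.
    if {[string index $prog 0] eq "/"} {
        set prog $prefix$prog
    }
    return [linsert [lrange $argv 1 end] 0 $prog]
}

# Sketch of the wrapper: transform the arguments, then call the stock exec.
# No rename is involved, so exec itself stays untouched.
proc im_exec {args} {
    set argv [im_exec_args $::tcl_platform(platform) "C:/project-open" $args]
    return [exec {*}$argv]
}

puts [im_exec_args windows "C:/project-open" {/bin/pwd}]
# prints C:/project-open/bin/pwd
puts [im_exec_args unix "C:/project-open" {/bin/find /tmp -name *.log}]
# prints /bin/find /tmp -name *.log
```

Because the arguments are passed on with {*} instead of being pasted into a string for eval, words containing spaces or special characters reach exec unchanged.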