Forum OpenACS Development: Re: NaviServer on Windows: [ns_info threads] returning unbalanced braces

Dear Gustaf,
SendFd in sockfile calls
(*sendProc)(sock, &iov, 1, timeoutPtr, flags)...
Well this is a ptr function, eventually the function WSASend(in nssock.c/264) gets called.
But this is an asynchronous call.
This is why, I had to put a delay (an ugly fix I know) immediately after the call inside SendFd.
Whithout this delay, the WSASend can still work even after the SendFd is finished, that is when the buffer "buf" does not exists any longer in the stack. And this is why a static buffer prevents the problem from occurring (of course at the cost of the function non being thread safe).
A proper solution would be using WASSend completion routines. I believe this could be the reason why we see memory corruption problems:
http://www.serverframework.com/asynchronousevents/2015/01/wsarecv-wsasend-and-thread-safety.html

This is design, coding error and it has nothing to do with the compiler, the compiler options and so on....

Hope it helps,
Maurizio

Errata-Corrige
1. In teory, "WSASend", called within "SockSend", is called with the last two parameters set to NULL should be blocking... so no problem should occur.
2. when replacing "WSASend" with a normal "send", "SockSend" works without problems
3. "WSASend" by "SockSend also by AOLserver.
Probably the behaviour fo the "WSASend" has changed with the latest version of Windows and/or Visual C/C++.

I will update the code consequently.

Maurizio

Dear Maurizio at al,

I've installed Visual Studio 2017 on a VM, and could reproduce the first error of this forums thread. As expected, there was a change in Visual Studio: with version 2015, Microsoft changed with the introduction of the Universal CRT the behavior of _vsnprintf, which is not longer identical to _vsnprintf, but complies now with the C99 standard.

This change in Visual Studio affects potentially many places in the code and explains, why i could not reproduce the mentioned problem with Visual Studio 11. Maybe this is also related to the problem with large files (which i still can't reproduce).

it is not unlikely, that there are still more fixes necessary when compiling with Visual Studio 2015 or newer, but at least the problem with the "unbalanced brackets" should be fixed.

The changes are committed to bitbucket.

Dear Gustaf,
thank you very much for having replicated the issue.
I'm not sure I agree completely with your explanation.
If you're explanation was correct how come the following small modification (which I believe is the memory bug I was referring too) in the file dstring.c,
// #ifdef _WIN32
// while (result == -1 && errno == ERANGE) {
// newLength = dsPtr->spaceAvl * 2;
//#else
if ((size_t)result >= bufLength) {
newLength = dsPtr->spaceAvl + (result - (int)bufLength);
//#endif
makes:
1. the problem with the unbalanced braces disappear
2. the odd characters in the log disappear
3. the erratic behaviour disappear?

Thank you,
Maurizio

I saw your fix, in the same file, in the same area... we can call it as we want....
1. a compilation issue
2. a matter of compliance with standards...

but at the end of the day it is a memory allocation problem causing wrong/erratic behaviour.

This does not solve the long file problem, which at the moment I "fixed" not using a delay, but replacing WSASend with a plain send. I do not know why the second works and the first one doesn't.

Thank you,
Maurizio

maurizio, does this mean, you see with this change still the "unbalanced braces" issue? If so, make sure, that in your local changes, you do not alter the definitions of naviserver in respect to snprintf/vsnprintf (as it was in the case with the changes you sent to me around April 2016).

Background: the committed change (not sure, if the snippet you posted is the same) affects the dstring infrastructure, which is all over the code to append to a string. A buffer overflow on such a string can have many weird effects. Up to Visual Studio 2015, the functions vsnprintf() and _vsnprintf() behaved in the same, microsoft specific way, namely returning -1 in case, the provided buffer was too small, not guaranteeing null termination.

With the introduction of Visual Studio 2015, the behavior of vsnprintf() changed to become c99 compliant, but __vsnprintf() stayed the same. the c99 standard states, that in cases where the buffer is too small, the function should return the number of bytes that would have been written. That is a very different behavior for recognizing buffer grow requests and buffer size management. The same change of behavior also affects e.g. snprintf()/_snprintf(). So, when using these functions, a change from a Visual Studio before version 2015 to a newer version requires code changes. Convince yourself by compiling with an earlier version of Visual Studio!

So far, i see no obvious connection between the dstring function and the WSASend() problem. So far, i could not reproduce this issue. I tried yesterday multiple times with requests for a 6+ MB file, but everything worked fine. I find it hard to believe, that the specification

if both lpOverlapped and lpCompletionRoutine are NULL, the socket in this function will be treated as a non-overlapped socket. (https://msdn.microsoft.com/en-us/library/windows/desktop/ms742203.aspx)
is not valid in newer versions.

If someone has instructions, how to reproduce the issue reliably, i would appreciate it. i can't test maurizio's binaries in my setup, since i don't have 10GB free on the c: partition, and i failed to extend the partition in the vm under mac os x; so i need probably a different, completely new setup, for which i have no time now).

Dear Gustaf,
I've tried all combinations... here's the result:

the original code was:

DWORD n1;
if (WSASend(sock, (LPWSABUF)bufs, nbufs, &n1, flags,
NULL, NULL) != 0) {
n1 = -1;
}
n = n1;

and it DOES NOT WOK

I then used the readability improvement modification you posted this morning at 09:57:40
DWORD bytesReceived;
int rc;

rc = WSASend(sock, (LPWSABUF)bufs, nbufs, &bytesReceived, flags,
NULL, NULL);

if (rc == 0) {
n = bytesReceived;
} else {
n = -1;
}
and it DOES WORK

My original fix was to use a plain send
# if _MSC_VER < 1900
DWORD n1;
if (WSASend(sock, (LPWSABUF)bufs, nbufs, &n1, flags,
NULL, NULL) != 0) {
n1 = -1;
}
n = n1;
#else
n = send(sock, (LPWSABUF)bufs[0].iov_base, bufs[0].iov_len, flags);

#endif

and it DOES WORK

I will include your modification in my distribution, though I cannot see any difference between the first two implementations, but the code is here if anyone wants to check.

Thank you very much,
Maurizio

Do i read your mail correctly that now all know issues are gone? That would be great.

Concerning the change around WSAsend(): The difference is not easy to spot but due to a type conversion, since n1 (type DWORD) was originally assigned -1 in error conditions. Since DWORD is defined as unsigned 32bit quantity, bad things happen when its value is assigned to a type of a different size and signedness (here: ssize_t; probably not an issue in 32bit versions). Interestingly, this type issue is not flagged by sonarsrv: the current version of nssock.c in sonarsrv is still the "old" code and shows 0 issues [1]. I've cleaned this code yesterday to prepare for debugging return-codes from WSASend().

Concerning send(): When send() is used as shown in your snippet, the code will produce incorrect results when nbufs <> 1. The reason for using WSAsend() instead of send() is that the former supports the more efficient scatter/gather interface.

all the best
-gn

[1] http://sonarsrv.spazioit.com/code/index?id=my%3Anaviserver#/my%3Anaviserver%3Ansd%2Fsock.c

Dear Gustaf,
I believe all issues we have been facing are gone.
They were caused by two errors:
1. memory allocation problem (in dstring.c)
2. bad handling of the WSASend call.

I do not know if you saw it or not. But looking into possible causes for these problems I noticed that EINTR is not handled in a generic way as EWOULDBLOCK and EINPROGRESS. Perhaps it would make sense to normalize also EINTR.

Thank you,
Maurizio