Forum OpenACS Q&A: AOLserver 4 not yet ready for prime time if you need SSL

Hi all,

This is just a quick heads-up.  In the run-up to the Sloan upgrade we encountered server wedges with AOLserver4 + nsopenssl 3.0 beta21.

The exact cause isn't yet known.  Yesterday the server wedged again and happily Dossy and Don were on IRC and able to help me poke around the wedged process a bit with gdb.

The upshot is that while AOLserver4 is considered quite stable by itself, AOLserver4 + SSL isn't yet production ready.

Copious details here:
https://openacs.org/irc/log/2004-08-11

In the short term we've reverted to AOLserver3.  Once we've got the upgrade behind us, I'll set up a debug-compiled test version of the wedging environment, and maybe if we get lucky we'll be able to contribute some helpful QA back to the AOLserver and nsopenssl teams.

Many, many thanks to Dossy and Don for their help.  Onward!

Andrew

That's spooky - I brought up a staging server on Linux with aols4 + nsopenssl beta21 for the customer to evaluate, and I'm getting sort of wedged myself. The development server running aols4 + beta17 is fine, but with this one I'm seeing nsd threads (processes in top) gobbling up resources but not actually doing anything. The server seems to be running underneath, with scheduled procs firing ok and new nsd threads being created and destroyed. Eventually the runaways overload the system and I have to kill it and restart. So far I've done that 3 times today and there's only a handful of users.

Looking at your irc thread it appears to be very similar. I was hoping to be able to deploy on aols4 but with the live date approaching I may have to revert to aols3 :o(

  - Steve

I had a site on aolserver4 + beta21 until last night, when I reverted it back to 3.3 also.  It was doing the same thing Andrew and Steve describe.

Oddly, while beta17 worked for Steve, it was even worse for me.  It would get stuck in a loop spewing out error messages that could bring the log file up to 2 GB in a matter of minutes.  Quite impressive, actually. :)  It seemed to work ok otherwise but we didn't stay on it long enough to really know.

We encountered several problems with nsopenssl 3 beta 18 when we tried using it in production.  One was that 40-bit browsers will crash it; see SF Bug #999089.  We would also find it getting into infinite loops and consuming the processor.  I added a check that seems to cure that symptom, although I don't know nsopenssl well enough to say if it is correct.  (I sent it to Scott and described it on the mailing list, but Scott's been busy.)

The changes are at http://www.jamierasmussen.com/download/nsopenssl.patch and this is on Win32.

Will nsopenssl 2.x work with AOLserver 4.0.7, as long as you don't use any of AOLserver 4.0's single process virtual hosting features (which did not exist in AOLserver 3.x)?

No, it doesn't compile. I admit I haven't spent any time trying to figure out why.

An interesting data point to come out of this: server restarts are about 2x faster in AOLserver4.

We have a 5.0.4 installation on our evaluation (soon to be production) server. AOLS4 is in /usr/local/aolserver-4/ which is symlinked to /usr/local/aolserver/.

Am I correct in thinking that it's a trivial job to just install AOLS3 with nsopenssl2.1a in, say, /usr/local/aolserver-3/ and switch the symlink across?

I don't have to do anything else to OpenACS other than a couple of bits in the config.tcl and change the nsopenssl config to the old style, do I?

If that's the case then I'll probably switch them to AOLS3 until the SSL issue is resolved. What is the downside to leaving them on AOLS3 (there's no requirement for virtual servers)?

  - Steve

Andrew: I'm glad you guys tracked this down. I'm having exactly the same problem (as far as I can tell).

My 'solution' was to use the etc/keepalive scripts to make sure the server is up and running. That way once the threads get all occupied, the server just restarts and then works again. It's a very poor solution, though, and not one I would use on a heavily trafficked server.

10: ttrace (response to 7)
Posted by Andrew Piskorski on
From what Zoran said, AOLserver 4 restarts are even faster when using ttrace. :)

FYI - I have a (low-ish) traffic server that people interact with over SSL every day and have not seen a single server wedge in the 2 or 3 months the server has been running. It hasn't been running for that time non-stop, but I have never had to restart it due to a wedge or utilisation problem.

Versions:

nsd : aolserver_v40_r2
nsopenssl : v3_0beta19

Mark,
is there such a thing as nsopenssl : v3_0beta19?
All I found were v2.0, 2.1, 2.1a:

http://sourceforge.net/project/showfiles.php?group_id=3152&package_id=41599

Where can I get your copy?

Greetings,
Nima

Not sure about beta19 ... but the source I commonly use is Scott Goodwin.
http://www.scottg.net/webtools/aolserver/modules/nsopenssl/

He has beta17 for download

nsopenssl is available from SourceForge CVS.  There are changes and bugfixes in HEAD that are newer than the tarballs that are on Scott's site.  You can view the ChangeLog and most recent sources online, see for example:
http://cvs.sourceforge.net/viewcvs.py/aolserver/nsopenssl/ChangeLog?view=markup

The SourceForge bug tracker for AOLserver has some nsopenssl items that are not fixed in even the HEAD sources.

My version is also from CVS. It works well, so I'm reluctant to move to HEAD. If it ain't broke...

Hi Mark (and all others having experience with AOLServer4 and SSL),

we, the University of Heidelberg, want to upgrade to dotLRN 2.1 soon.
Right now, AOLServer (3.3ad13) and the Oracle db (8.1.7) are located on the same Sun Solaris machine, but we want to split front-end and db because of several severe performance problems, which hardly seem to be caused by the dotLRN/ACS code.
So we installed AOLServer4 on a Linux (Suse) box and kept the Sun for Oracle, but before going into production, I have a question:
Is your AOLServer instance still stable, and do you or your users often make use of SSL connections?
Heidelberg uses SSL extensively, so I want to be sure that the former problems with AOLServer4 are fixed...

Thanks in advance for a short answer.

Regards,
Martin

martin,

although I am not answering your question,
our setup might be an option for you in case there are still
problems with ns_ssl. On our setup, all user requests
use SSL and we are running multiple AOLserver 4.* instances in
production. However, we use the reverse proxy Pound
for all incoming requests. Pound does the SSL decoding
and forwards the requests in plain HTTP to the backend
servers depending on the URL.

This configuration based on Pound has worked without problems
on our server for more than 1.5 years; I have seen up to 300 concurrent Pound connections.

So, the ssl issue - if it still exists - should not
be a show stopper...

greetings
-gustaf

I can second Gustaf's experience with Pound and SSL. Running a similar setup, I haven't seen any SSL-related problems when Pound handles SSL requests.

I can contribute code to make SSL requests accepted by Pound transparent to OpenACS.

/Bart
the Code Mill

Bart, Being curious: what did you do to achieve transparency?

If I remember correctly, we just added
the following switch to rp_filter in the
request processor procs of the OACS code, so that
[ad_conn peeraddr] returns the forwarded-for address
instead of the proxy's address.

We handle redirections in the proxy and use the
access logfile of pound instead of the aolserver.

-gustaf

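# ReverseProxyMode is read from the ns/parameters config section (default: off).
if { [ns_config -bool ns/parameters ReverseProxyMode 0] } {
  # Behind a proxy: use the last X-Forwarded-For entry as the client address.
  set addr [lindex [ns_set iget [ns_conn headers] x-forwarded-for] end]
  if {[string length $addr] == 0} {
    # Header missing or empty: fall back to the directly connected peer.
    set addr [ns_conn peeraddr]
  }
  ad_conn -set peeraddr $addr
} else {
  ad_conn -set peeraddr [ns_conn peeraddr]
}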
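
For reference, the switch above reads a boolean ReverseProxyMode parameter from the ns/parameters section, so it would presumably be turned on in the server's config file with something like the fragment below. This is only a sketch based on the ns_config call in the snippet; the parameter is a local addition to the OpenACS code, not a stock AOLserver setting.

```tcl
# Config fragment (AOLserver config.tcl), not a standalone script.
# ReverseProxyMode is the custom parameter read by the rp_filter switch above.
ns_section ns/parameters
ns_param   ReverseProxyMode true
```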

Hi Gustaf, hi Bart,

thanks for this hint. Right now, I hope we can manage without a third tool in order to keep the system as "simple" as possible. But it is good to know that there's another option.

Thanks!
/Martin

Gustaf,

I let Pound add an extra X header when the client has a HTTPS connection to Pound. X-SSL-Request is set to true to indicate that the connection to the browser is secure.

The request processor and the security procs of OpenACS then treat the connection as if it were a direct HTTPS connection to AOLserver. Schematically:

HTTPS -> Pound -> HTTP + X-SSL-Request: true -> AOLserver

is treated the same as:

HTTPS -> AOLserver

For now the mods to OpenACS don't verify that a request carrying the header really originates from a trusted proxy. A hacker knowing the (internal) IP address of the proxy could potentially spoof a secure connection. Cross-referencing the IP address against a list of trusted IP addresses, however, is trivial to add.
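
The trusted-proxy cross-check mentioned above could look roughly like this. It is a minimal sketch, not the actual OpenACS modification: the proc name request_is_secure_p and the trusted address list are hypothetical, and only the X-SSL-Request header name comes from the description above. The proc is plain Tcl so it can be tried outside AOLserver.

```tcl
# Hypothetical helper (not part of OpenACS): decide whether a proxied
# request may be treated as secure.
#   peeraddr   - address of the directly connected peer
#   ssl_header - value of the X-SSL-Request header, "" if absent
#   trusted    - list of proxy addresses allowed to assert the header
proc request_is_secure_p {peeraddr ssl_header trusted} {
    # Ignore the header unless the peer is one of the trusted proxies.
    if {[lsearch -exact $trusted $peeraddr] < 0} {
        return 0
    }
    # Accept only an explicit "true", compared case-insensitively.
    return [string equal -nocase [string trim $ssl_header] "true"]
}
```

Inside AOLserver the arguments would presumably come from [ns_conn peeraddr] and [ns_set iget [ns_conn headers] X-SSL-Request], with the trusted list read from the config file.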

Did you know that AOLserver 4.x automatically records the X-Forwarded-For IP address?

Bart
the Code Mill

Hi,

Does anyone know if these problems have been fixed yet?

Also, what is Pound? I've not come across it before; can someone send me a URL?

Would a similar thing be possible using Apache as the proxy server, do you think?

Simon

Here is a link to Pound

You can also find some comments about Pound and AOLserver at Bart's site.

Pound might be a nice solution for a lot of things in the OACS.

talli

Hi Bart,

I was not aware that OACS treats secure connections differently. We only allow IP traffic to the backend
from trusted machines (e.g. the proxy) via the firewall.
Since all OACS applications use ad_conn peeraddr,
the mentioned fix was easy enough.

Concerning 4.x & forwarded-for: no, I was not aware
of this either. I once brought it up on the AOLserver
list, but got the impression that people were mostly
objecting to the idea, since it would in the general
case leave room for spoofing.

We strip all incoming X-Forwarded-For headers in the
Pound configuration; together with the firewall
rules, I believe we are on the safe side.

-gustaf

Hi,

seeing no more (at least Unix-related) bugs regarding nsopenssl on the AOLServer bug tracker, I asked on the SourceForge forum whether production use of AOLServer & SSL is recommended or not.
I will inform you about any responses.

/Martin

Martin,

The sourceforge forum doesn't get a lot of traffic.  A better place to post and ask is the AOLserver mailing list.

http://aolserver.com/lists.php

I see on SourceForge that Dossy is on top of things and responded to you there.

We are using the latest in production and don't think we have any more issues.  We still get a few coredumps here and there, but I haven't had time to look into it in depth to see if it is related to SSL anymore.

Hi,

Dossy replied:
       As you pointed out, there are no open bugs in the tracker for nsopenssl on Unix/Linux platforms. The latest beta, nsopenssl v3.0 beta 23, is believed to be stable at this point and is the recommended version for production use.

If you do find any issues with it, please file a bug. Thanks.

-- Dossy

Hope she's right 😊. Right now, we've upgraded a test-server to dotLRN 2.1 using AOLServer40r2 and nsopenssl v3_0beta23, but in order to get latest fixes, we will upgrade to AOLServer40r8.

/Martin

Sloan has seen their nsd processes consuming a lot more memory with AOLserver 4, to the point where I believe Andrew suspects a memory leak.

I have also noticed that my own ACS sites I have on AOLserver 4 are consuming more memory than they used to, but it is on a smaller scale than Andrew reports so I can't really tell if I am seeing a memory leak or not.  These sites do not use .LRN, so it's not a completely fair comparison.

I have no idea whether this is related to SSL or not, but I wanted to mention it anyway as an FYI.  I believe Dossy (who is a he, btw :) is looking into it and I don't think it should stop you from upgrading, but it is something to keep an eye on.

Hi Charles, hi Janine,

thanks for sharing your experience, and sorry for the feminization of Dossy...
Regarding possible memory (leakage) issues, it can't hurt to have a 3.3 nsd on standby.

Hi,

just a short report regarding our experience with AOLServer 4.0.8 and v3_0beta23:

On 21 October we moved AOLServer 4.0.8 from our "Sun" computer, which hosted both the webserver and Oracle at that moment, to a Linux box. The result was that most pages were at least twice as fast as before (see https://openacs.org/forums/message-view?message_id=214343 for a few more details)

Regarding stability of SSL:
We also noticed that AOLServer 4.0.8 seems to need much more memory per thread and that there may still be some leakage issues. At least, with maxthreads==minthreads, we saw a monotonic increase in memory usage, sometimes coming very close to or reaching swapping. Even with a very low number of connection threads it was only a question of time.
So we configured AOLServer to kill its threads after about an hour by setting maxthreads=25, minthreads=20 and threadtimeout=3600. This seems to be stable (knocking on wood 😊 ).
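
As a sketch, those settings would look something like this in the server's config.tcl. The three values are the ones given above; the section name assumes the AOLserver 4.0 layout where thread settings live under ns/server/${server}, and the server name is a placeholder.

```tcl
# Config fragment (AOLserver config.tcl), not a standalone script.
set server "service0"             ;# placeholder server name (assumption)

ns_section "ns/server/${server}"
ns_param   maxthreads    25       ;# upper limit on connection threads
ns_param   minthreads    20       ;# lower than maxthreads, so threads can exit
ns_param   threadtimeout 3600     ;# seconds, i.e. about an hour
```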

I just saw that Dossy tagged nsopenssl "v3_0beta26", so maybe, we will try this one.

<blockquote> At least, with maxthreads==minthreads, we saw a
monotonic increase in memory usage, sometimes coming very
close to or reaching swapping. Even with a very low
number of connection threads it was only a question of time.
So we configured AOLServer to kill its threads after
about an hour by setting maxthreads=25, minthreads=20 and
threadtimeout=3600. This seems to be stable (knocking
on wood 😊 ).
</blockquote>

Interesting!  We're seeing memory growth too, even with beta26.  I've just modded our config file to match your settings.