Forum OpenACS Q&A: Re: Re: Assessment - urgent

Posted by Peter Alberer on
Rocael, could you find out WHY exactly those deadlocks are happening in the assessment package? I experienced similar problems but am not really sure why the deadlocks are appearing...
Posted by Carl Robert Blesius on
Peter, we discovered the deadlock problem over the past few days as well and I asked Solution Grove to look into it. Looks like Roel was able to find a work-around yesterday (he added a while loop that will retry if it comes up):

http://xarg.net/tools/cvs/change-set-details?key=22801

Below is info I found in the Postgres docs. Maybe Roel or Eduardo (the author in this case) could chime in, in the interest of incremental improvements? I am wondering whether it has to do with a user double-clicking or whether it is something more serious. I was actually doing some live usability testing when this came up on Monday, and I think the person I was watching might have double-clicked between sections (although she was not sure and I was not watching her hand).

Carl

"Use of explicit locking can cause deadlocks, wherein two (or more) transactions each hold locks that the other wants. For example, if transaction 1 acquires an exclusive lock on table A and then tries to acquire an exclusive lock on table B, while transaction 2 has already exclusive-locked table B and now wants an exclusive lock on table A, then neither one can proceed. PostgreSQL automatically detects deadlock situations and resolves them by aborting one of the transactions involved, allowing the other(s) to complete. (Exactly which transaction will be aborted is difficult to predict and should not be relied on.)

The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order. One should also ensure that the first lock acquired on an object in a transaction is the highest mode that will be needed for that object. If it is not feasible to verify this in advance, then deadlocks may be handled on-the-fly by retrying transactions that are aborted due to deadlock.

So long as no deadlock situation is detected, a transaction seeking either a table-level or row-level lock will wait indefinitely for conflicting locks to be released. This means it is a bad idea for applications to hold transactions open for long periods of time (e.g., while waiting for user input)."
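To make the "retry transactions that are aborted due to deadlock" advice concrete, here is a minimal Python sketch of a retry loop. It is not the change Roel committed. `DeadlockDetected`, `run_with_retry`, and `make_flaky_txn` are hypothetical names made up for illustration, and the fake transaction only simulates PostgreSQL aborting a transaction with SQLSTATE 40P01 (deadlock_detected).

```python
import random
import time


class DeadlockDetected(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40P01."""


def run_with_retry(txn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run txn(); if it is aborted by a deadlock, back off and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockDetected:
            if attempt == max_attempts:
                raise  # give up and let the caller report the failure
            # Randomized, growing delay so colliding requests do not retry in lockstep.
            sleep(base_delay * attempt * (1 + random.random()))


def make_flaky_txn(failures):
    """Build a fake transaction that deadlocks `failures` times, then succeeds."""
    state = {"calls": 0}

    def txn():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise DeadlockDetected("deadlock detected")
        return state["calls"]  # number of attempts it took

    return txn
```

Calling `run_with_retry(make_flaky_txn(2))` succeeds on the third attempt. Every retry adds to the time the user waits for the page, so the attempt limit and delays here are only placeholder values.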

Posted by Peter Alberer on

I have also thought about double-clicking as the cause of the problem, but I am not so sure now. Prior to our last big evaluation, I added the following to the submit button in
packages/assessment/www/assessment-section-submit.adp:

<input type=button value="assessment.Submit" onclick="this.form.submit(); this.disabled=true; this.value='Verarbeitung läuft...'; return true;">

The JavaScript code disables the submit button, so the user cannot easily submit the form twice (except maybe with the Return key on the keyboard). Despite this change, we still saw the deadlocks :(

37: Deadlocks (response to 32)
Posted by Roel Canicula on
The deadlocks can still happen if two or more people access the same assessment at almost the same time (a double-click could probably cause it too, when nobody else is accessing the assessment). I traced the deadlock to as::item_data::new, which gets called for each question displayed in the assessment.

It easily deadlocked when I ran procs that called it concurrently and in a loop. I also saw deadlocks from as::session::new and as::session_data::new but less frequently.

At first we thought this was caused by nested transactions but I ran the procs without the transactions and they still deadlocked.

Dave suggested that it could be a problem with inserting into a view, and that we try to replicate the problem with much simpler code to find out whether the problem is in the CR itself. I'm going to try that later.

Posted by Rocael Hernández Rizzardini on
Peter, I do know.
It is because of the creation of CR items / revisions, which assessment uses heavily. The db_transactions that lock the cr_item / revision tables contribute as well.

Imagine you are serving an assessment with 20 questions to 20 persons at the same time, and the creation of a new cr_item takes about 100 ms. That will end up either in big delays to serve the request (if you increase the deadlock timeout) or in deadlocks. In this example it takes 20 x 100 ms = 2000 ms to create all the items of a given assessment; for 20 persons that is about 40 seconds in total to create all the items requested at the same time. Eventually you'll have deadlocks, and they will propagate among all the applications that might be using (creating / updating) cr_items / revisions at the same time.
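The arithmetic in that example can be written out as a small Python sketch. It assumes the worst case, where item creation is fully serialized across all users; the function name `serialized_item_creation_ms` is made up for illustration, and the 100 ms figure is the estimate from the post, not a measurement.

```python
def serialized_item_creation_ms(questions, users, ms_per_item):
    """Return (ms per user, total ms) if every cr_item is created one after another."""
    per_user_ms = questions * ms_per_item  # all items of one assessment
    total_ms = per_user_ms * users         # all users queued behind each other
    return per_user_ms, total_ms


per_user, total = serialized_item_creation_ms(questions=20, users=20, ms_per_item=100)
print(f"{per_user} ms per assessment, {total / 1000:.0f} s for all users")
```

With these inputs the last user in the queue would wait far longer than a 1 second page-serving target, which is the scaling concern raised here.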

Finally, intensive concurrent use of the CR is NOT scaling so far.

Posted by Rocael Hernández Rizzardini on
The answer is to NOT use the CR, especially when serving the assessment and recording the answers, or at least to be able to turn it off.

Retrying is only acceptable if you DO NOT care about your users waiting long periods of time for the page to be served, which is something you don't want, especially when you aim to serve pages in less than 1 sec.