Forum OpenACS Q&A: Table "Users" corrupted?

Collapse
Posted by Taka Chan on
I have found the following error at server log:

[20/Jul/2001:17:38:08][8622.1026][-sched-] Notice: Replicating Web
Robots DB, because data in robots table has expired
[20/Jul/2001:17:38:08][8622.4101][-conn0-] Error: Ns_PgExec: result
status: 7 message: ERROR:  RelationBuildTriggers: 1 record(s) not
found for rel users

[20/Jul/2001:17:38:08][8622.4101][-conn0-] Error: dbinit: error
(localhost::web,ERROR:  RelationBuildTriggers: 1 record(s) not found
for rel users
): '
    insert into sec_sessions(session_id, user_id, token,
secure_token, browser_id,
    last_ip, last_hit)
    values(20445, null, 'i1V6RvzfNDt6AT5oiyDlHMiSxDsEijsQ', '', 6669,
    '202.181.209.106', 995621888)
    '
[20/Jul/2001:17:38:08][8622.4101][-conn0-] Error: Filter
sec_read_security_info returned error #1: Database operation "dml"
failed
[20/Jul/2001:17:38:08][8622.4101][-conn0-] Error: tclop: invalid
return code from filter proc 'Critical filter sec_read_security_info
failed.': must be filter_ok, filter_return, or filter_break

I have tried to login to Postgres interactive terminal, and
type "select * from users".  It returns the following message:

RelationBuildTriggers: 1 record(s) not found for rel users

Then I tried to list out all tables, and the table "users" is still
here, and I can also retrieve the schema of the table by typing "d
users".

What is the problem?  Anyone can help?  Thanks

Collapse
Posted by Don Baccus on
I've never seen this - what version of Postgres are you using?  If it is PG 7.1 and if you can afford to let your system sit in this broken state for a bit and are willing to do a little work to help track down the bug, e-mail Tom Lane at mailto:tgl@sss.pgh.pa.us and describe the problem and offer your help.  He'll be able to figure out just what's causing the problem, and give you some specific advice for working around it.

Don't do this if you're using PG 7.0 or earlier.

If you can't wait and need to fix it ASAP, the first thing to try is to stop the postmaster and restart it.  Do this CORRECTLY i.e. via pg_ctl if you're running PG 7.1.

Collapse
Posted by Pascal Scheffers on
And if you cannot keep your db in this state: you can tar all data files (shutdown the postmaster first). They're in the directory pointed to by environment variable PGDATA, usually /usr/local/pgsql/data. They really appreciate bug reports :)
Collapse
Posted by Don Baccus on
Yes - good point.  Let me double-remphasize "SHUT DOWN THE POSTMASTER FIRST" if you're going to tar up the data files ...
Collapse
Posted by Taka Chan on
Unfortunately, I am using PostgreSQL 7.0.3, what can I do?
Collapse
Posted by Don Baccus on
You should upgrade to PG 7.1 - do you have a recent backup of the system?  You can try doing a pg_dump now to see if that will work.

Have you stopped the postmaster and restarted it to see if the problem goes away?  *don't* kill -9 it,  use kill -TERM.

Collapse
Posted by Taka Chan on
I have tried to stop it by using 'pg_ctl' and then restart the postmaster, but the problem still exists.

Moreover, I have seen a thread talking about that using pg_dump to export data and then import it again will cause the aolserver unstable.  And I also experienced it before.  I need to reload the datamodel and the insert statement of data separately.  Is there any method can export and then re-import the whole database that will not cause any problem?  If not, that would be a nightmare for me...

Besides, is the command "vacuum" helpful?

Collapse
Posted by Roberto Mello on
The problem is that the pg_dump in 7.0.x has a problem in dumping functions IIRC. What has worked for me (and many people), is a) making the pg_dump, b) upgrading to PG 7.1, c) loading the dump file, d) reloading postgres.sql (in the OpenACS datamodel).
Collapse
Posted by Don Baccus on
You should reverse the last two steps, i.e. load postgres.sql then reload the dump ...