IV.17.13 OpenACS Internationalization Requirements

by Henry Minsky, Yon Feldman, Lars Pind, Peter Marklund, Christian Hvid, and others.

OpenACS docs are written by the named authors, and may be edited by OpenACS documentation staff.

This document describes the requirements for functionality in the OpenACS platform to support globalization of the core and optional modules. The goal is to make it possible to support delivery of applications which work properly in multiple locales with the lowest development and maintenance cost.

internationalization (i18n): The provision within a computer program of the capability of making itself adaptable to the requirements of different native languages, local customs and coded character sets.
locale: The definition of the subset of a user's environment that depends on language and cultural conventions.
localization (L10n): The process of establishing information within a computer system specific to the operation of particular native languages, local customs and coded character sets.
globalization: A product development approach which ensures that software products are usable in the worldwide markets through a combination of internationalization and localization.

The Mozilla project suggests keeping two catchy phrases in mind when thinking about globalization:

One code base for the world
English is just another language

Building an application often involves making a number of assumptions on the part of the developers which depend on their own culture. These include constant strings in the user interface and system error messages, names of countries, cities, order of given and family names for people, syntax of numeric and date strings and collation order of strings.

OpenACS should be able to operate in languages and regions beyond US English. The goal of OpenACS Globalization is to provide a clean and efficient way to factor out the locale-dependent functionality from our applications, so that alternate localizations can be swapped in easily.

This in turn will reduce redundant, costly, and error prone rework when targeting the toolkit or applications built with the toolkit to another locale.

The cost of porting the OpenACS to another locale without some kind of globalization support would be large and ongoing, since without a mechanism to incorporate the locale-specific changes cleanly back into the code base, it would require making a new fork of the source code for each locale.

A globalized application will perform some or all of the following steps to handle a page request for a specific locale:

Decide what the target locale is for an incoming page request
Decide which character set encoding the output should be delivered in
If a script file to handle the request needs to be loaded from disk, determine if a character set conversion needs to be performed when loading the script
If needed, locale-specific resources are fetched. These can include text, graphics, or other resources that would vary with the target locale.
If content data is fetched from the database, check for locale-specific versions of the data (e.g. country names).
Source code should use a message catalog API to translate constant strings in the code to the target locale
Perform locale-specific linguistic sorting on data if needed
If the user submitted form input data, decide what character set encoding conversion if any is needed. Parse locale-specific quantities if needed (number formats, date formats).
If templating is being used, select correct locale-specific template to merge with content
Format output data quantities in locale-specific manner (date, time, numeric, currency). If templating is being used, this may be done either before and/or after merging the data with a template.

Since the internationalization APIs may potentially be used on every page in an application, the overhead for adding internationalization to a module or application must not cause a significant time delay in handling page requests.

In many cases there are facilities in Oracle to perform various localization functions, and also there are facilities in Java which we will want to move to. So the design to meet the requirements will tend to rely on these capabilities, or close approximations to them where possible, in order to make it easier to maintain Tcl and Java OpenACS versions.

Here are the cases that we need to be able to handle efficiently:

A developer needs to author a web site/application in a language besides English, and possibly a character set besides ISO-8859-1. This includes the operation of the OpenACS itself, i.e., navigation, admin pages for modules, error messages, as well as additional modules or content supplied by the web site developer.

What do they need to modify to make this work? Can their localization work be easily folded in to future releases of OpenACS?
A developer needs to author a web site which operates in multiple languages simultaneously. For example, www.un.org with content and navigation in multiple languages.

The site would have an end-user visible UI to support these languages, and the content management system must allow articles to be posted in these languages. In some cases it may be necessary to make the modules' admin UIs operate in more than one supported language, while in other cases the backend admin interface can operate in a single language.
A developer is writing a new module, and wants to make it easy for someone to localize it. There should be a clear path to author the module so that future developers can easily add support for other locales. This would include support for creating resources such as message catalogs, non-text assets such as graphics, and use of templates which help to separate application logic from presentation.

Other application servers: ATG Dynamo, Broadvision, Vignette, ... ? Anyone know how they deal with i18n?

System/Package "coversheet" - where all documentation for this software is linked off of
Design document
Developer's guide
User's guide
Other-cool-system-related-to-this-one document

LI18NUX 2000 Globalization Specification: http://www.li18nux.net/

Mozilla i18N Guidelines: http://www.mozilla.org/docs/refList/i18n/l12yGuidelines.html

ISO 639:1988 Code for the representation of names of languages http://sunsite.berkeley.edu/amher/iso_639.html

ISO 3166-1:1997 Codes for the representation of names of countries and their subdivisions Part 1: Country codes http://www.niso.org/3166.html

IANA Registry of Character Sets
Test plan
Competitive system(s)

Because the requirements for globalization affect many areas of the system, we will break up the requirements into phases, with a base required set of features, and then stages of increasing functionality.

10.0

A standard representation of locale will be used throughout the system. A locale refers to a language and territory, and is uniquely identified by a combination of ISO language and ISO country abbreviations.

See Content Repository Requirement 100.20
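A minimal sketch of one possible locale representation, written in Python purely for illustration. The `Locale` class and its `parse` method are hypothetical names, not part of any OpenACS API; the sketch only assumes what the requirement states, namely a two-letter ISO 639 language code combined with a two-letter ISO 3166 country code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Locale:
    """Hypothetical locale value: ISO 639 language plus optional ISO 3166 country."""

    language: str
    country: Optional[str] = None

    @classmethod
    def parse(cls, tag: str) -> "Locale":
        # Accept "sv_SE" or "sv-se"; a bare "sv" is a language-only locale.
        parts = tag.replace("-", "_").split("_")
        language = parts[0].lower()
        if len(language) != 2 or not language.isalpha():
            raise ValueError(f"bad language code in {tag!r}")
        country = parts[1].upper() if len(parts) > 1 else None
        if country is not None and (len(country) != 2 or not country.isalpha()):
            raise ValueError(f"bad country code in {tag!r}")
        return cls(language, country)

    def __str__(self) -> str:
        # Canonical form: lowercase language, uppercase country.
        if self.country is None:
            return self.language
        return f"{self.language}_{self.country}"
```

Normalizing case on input means that "sv-se", "SV_SE" and "sv_SE" all compare equal once parsed, which keeps lookups keyed by locale consistent.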

10.10 Provide a consistent representation and API for creating and referencing a locale

10.20 There will be a Tcl library of locale-aware formatting and parsing functions for numbers, dates and times. Note that Java has builtin support for these already.

10.30 For each locale there will be default date, number and currency formats. Currency i18n is NOT IMPLEMENTED for 5.0.0.

10.40 Administrators can upgrade their servers to use new locales via the APM. NOT IMPLEMENTED in 5.0.0; current workaround is to get an xml file and load it manually.
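As an illustration of requirements 10.20 and 10.30, the following Python sketch shows table-driven formatting and parsing with per-locale defaults. The `DEFAULT_FORMATS` table, its three sample entries and the function names are assumptions made for this example; a real implementation would load such defaults from locale data rather than hard-code them.

```python
from datetime import date

# Hypothetical per-locale defaults (illustrative values only).
DEFAULT_FORMATS = {
    "en_US": {"group": ",", "decimal": ".", "date": "%m/%d/%Y"},
    "de_DE": {"group": ".", "decimal": ",", "date": "%d.%m.%Y"},
    "sv_SE": {"group": " ", "decimal": ",", "date": "%Y-%m-%d"},
}


def format_number(value: float, locale: str, places: int = 2) -> str:
    """Format a number using the locale's grouping and decimal separators."""
    fmt = DEFAULT_FORMATS[locale]
    text = f"{value:,.{places}f}"  # e.g. "1,234,567.89"
    # Swap both separators in one pass so they cannot clobber each other.
    return text.translate(str.maketrans({",": fmt["group"], ".": fmt["decimal"]}))


def parse_number(text: str, locale: str) -> float:
    """Parse a number written with the locale's separators."""
    fmt = DEFAULT_FORMATS[locale]
    return float(text.replace(fmt["group"], "").replace(fmt["decimal"], "."))


def format_date(value: date, locale: str) -> str:
    """Format a date using the locale's default numeric pattern."""
    return value.strftime(DEFAULT_FORMATS[locale]["date"])
```

For example, `format_number(1234567.891, "de_DE")` gives `1.234.567,89`, and `parse_number` reverses the transformation for input validation.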

20.0

The request processor must have a mechanism for associating a locale with each request. This locale is then used to select the appropriate template for a request, and will also be passed as the locale argument to the message catalog or locale-specific formatting functions.

20.10 The locale for a request should be computed by the following method, in descending order of priority:

get locale associated with subsite or package id

get locale from user preference

get locale from site wide default

20.20 An API will be provided for getting the current request locale from the ad_conn structure.
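The priority order in requirement 20.10 can be sketched as follows. This is a Python illustration under stated assumptions: the function name and its three arguments are hypothetical, and each argument is taken to be either a locale string or `None` when that source has no setting.

```python
from typing import Optional


def resolve_request_locale(
    package_locale: Optional[str],
    user_locale: Optional[str],
    site_default: str,
) -> str:
    """Return the first locale that is set, in descending order of priority."""
    # 1. Locale associated with the subsite or package id.
    # 2. Locale from the user's preference.
    for candidate in (package_locale, user_locale):
        if candidate:
            return candidate
    # 3. Site-wide default, which is assumed to always exist.
    return site_default
```

A request processor could compute this once per request and cache the result on the connection, so that later calls to the message catalog or formatting functions do not repeat the lookup.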

30.0

A mechanism must be provided for a developer to group a set of arbitrary content resources together, keyed by a unique identifier and a locale.

For example, what approaches could be used to implement a localizable nav-bar mechanism for a site? A navigation bar might be made up of a set of text strings and graphics, where the graphics themselves are locale-specific, such as images of English or Japanese text (as on www.un.org). It should be easy to specify alternate configurations of text and graphics to lay out the page for different locales.

Design note: Alternative mechanisms to implement this functionality might include using templates, Java ResourceBundles, content-item containers in the Content Repository, or some convention assigning a common prefix to key strings in the message catalog.

40.0

A message catalog facility will provide a database of translations for constant strings for multilingual applications. It must support the following:

40.10 Each message will be referenced via a unique key.

40.20 The key for a message will have some hierarchical structure to it, so that sets of messages can be grouped with respect to a module name or package path.

40.30 The API for lookup of a message will take a locale and message key as arguments, and return the appropriate translation of that message for the specified locale.

40.40 The API for lookup of a message will accept an optional default string which can be used if the message key is not found in the catalog. This lets the developer get code working and tested in a single language before having to initialize or update a message catalog.

40.50 For use within templates, custom tags which invoke the message lookup API will be provided.

40.60 Provide a method for importing and exporting a flat file of translation strings, in order to make it as easy as possible to create and modify message translations in bulk without having to use a web interface.

40.70 Since translations may be in different character sets, there must be provision for writing and reading catalog files in different character sets. A mechanism must exist for identifying the character set of a catalog file before reading it.

40.80 There should be a mechanism for tracking dependencies in the message catalog, so that if a string is modified, the other translations of that string can be flagged as needing update.

40.90 The message lookup must be as efficient as possible so as not to slow down the delivery of pages.
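A minimal in-memory sketch of the lookup behaviour described in 40.10 through 40.40, in Python for illustration. The `MessageCatalog` class, its method names and the dotted key convention (`package.message`) are assumptions for this example and do not describe the actual OpenACS message catalog API.

```python
from typing import Dict, List, Optional, Tuple


class MessageCatalog:
    """Hypothetical catalog keyed by (locale, message key)."""

    def __init__(self) -> None:
        self._messages: Dict[Tuple[str, str], str] = {}

    def register(self, locale: str, key: str, text: str) -> None:
        self._messages[(locale, key)] = text

    def lookup(self, locale: str, key: str, default: Optional[str] = None) -> str:
        """Return the translation, or the optional default if the key is missing."""
        try:
            return self._messages[(locale, key)]
        except KeyError:
            if default is not None:
                return default
            raise KeyError(f"no message {key!r} for locale {locale!r}") from None

    def keys_under(self, prefix: str) -> List[str]:
        """List the distinct keys grouped under a package or module prefix."""
        return sorted({k for (_, k) in self._messages if k.startswith(prefix + ".")})


catalog = MessageCatalog()
catalog.register("en_US", "forums.post_reply", "Reply")
catalog.register("sv_SE", "forums.post_reply", "Svara")
catalog.register("en_US", "forums.new_thread", "New thread")
```

A dictionary lookup keeps the per-message cost constant, which is one way to approach requirement 40.90; the default argument lets `catalog.lookup("da_DK", "forums.post_reply", default="Reply")` succeed before a Danish translation exists.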

Character Sets

50.0 A locale will have a primary associated character set which is used to encode text in the language. When given a locale, we can query the system for the associated character set to use.

The assumption is that we are going to use Unicode in our database to hold all text data. Our current programming environments (Tcl/Oracle or Java/Oracle) operate on Unicode data internally. However, since Unicode is not yet commonly used in browsers and authoring tools, the system must be able to read and write other character sets. In particular, conversions to and from Unicode will need to be explicitly performed at the following times:

Loading source files (.tcl or .adp) or content files from the filesystem
Accepting form input data from users
Delivering text output to a browser
Composing an email message
Writing data to the filesystem

Acs-templating does the following.

When the acs-templating package opens an ADP or Tcl file, it assumes the file is iso-8859-1. If the output charset (OutputCharset) in the AOLserver config file is set, then acs-templating assumes it's that charset.
Writing Files
When the acs-templating package writes an ADP or Tcl file, it assumes the file is iso-8859-1. If the output charset (OutputCharset) in the AOLserver config file is set, then acs-templating assumes it's that charset.

There are two classes of Tcl files loaded by the system; library files loaded at server startup, and page script files, which are run on each page request.

Should we require all Tcl files be stored as UTF8? That seems too much of a burden on developers.

50.10 Tcl library files can be authored in any character set. The system must have a way to determine the character set before loading the files, probably from the filename.

50.20 Tcl page script files can be authored in any character set. The system must have a way to determine the character set before loading the files, probably from the filename.

50.30 Data which is submitted with a HTTP request using a GET or POST method may be in any character set. The system must be able to determine the encoding of the form data and convert it to Unicode on demand.

50.35 The developer must be able to override the default system choice of character set when parsing and validating user form data. INCOMPLETE - form widgets in acs-templating/tcl/date-procs.tcl are not internationalized. Also, acs-templating's UI needs to be internationalized by replacing all user-visible strings with message keys.

50.30.10 In Japan and some other Asian countries where there are multiple character set encodings in common use, the server may need to attempt auto-detection of the character set, because buggy browsers may submit form data in an unexpected alternate encoding.
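One simple way to approximate the auto-detection mentioned in 50.30.10 is trial decoding: attempt each candidate encoding in a fixed order and accept the first that decodes without error. The Python sketch below is a heuristic under that assumption, not a statement of how OpenACS does it; the function name and the candidate order are hypothetical, and trial decoding can misidentify short or ambiguous input.

```python
from typing import Sequence, Tuple


def decode_form_data(
    raw: bytes,
    candidates: Sequence[str] = ("utf-8", "euc_jp", "shift_jis"),
) -> Tuple[str, str]:
    """Return (encoding, text) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return encoding, raw.decode(encoding)
        except UnicodeDecodeError:
            # Not valid in this encoding; try the next candidate.
            continue
    raise ValueError("form data does not match any candidate encoding")
```

Stricter encodings are listed first because a permissive encoding placed early would accept almost any byte sequence and hide the real one.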

50.40 The output character set for a page request will be determined by default by the locale associated with the request (see requirement 20.0).

50.50 It must be possible for a developer to manually override the output character set encoding for a request using an API function.
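Requirements 50.0, 50.40 and 50.50 together describe a default character set per locale with a developer override. A minimal Python sketch, assuming a hypothetical `DEFAULT_CHARSETS` table with illustrative entries and hypothetical function names:

```python
from typing import Optional, Tuple

# Hypothetical locale-to-charset defaults (illustrative values only).
DEFAULT_CHARSETS = {
    "en_US": "iso-8859-1",
    "ja_JP": "shift_jis",
    "ru_RU": "koi8-r",
}


def output_charset(locale: str, override: Optional[str] = None,
                   fallback: str = "utf-8") -> str:
    """Pick the output charset: explicit override, then locale default, then fallback."""
    if override:
        return override
    return DEFAULT_CHARSETS.get(locale, fallback)


def encode_page(text: str, locale: str,
                override: Optional[str] = None) -> Tuple[str, bytes]:
    """Convert Unicode page text to the bytes delivered to the browser."""
    charset = output_charset(locale, override)
    # Characters the charset cannot represent become numeric character references.
    return charset, text.encode(charset, errors="xmlcharrefreplace")
```

Returning the charset alongside the bytes lets the caller set the matching Content-Type header, so the declared and actual encodings cannot drift apart.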

60.10 All OpenACS error messages must use the message catalog and the request locale to generate an error message for the appropriate locale. NOT IMPLEMENTED for 5.0.0.

60.20 Web server error messages such as 404, 500, etc must also be delivered in the appropriate locale.

60.30 Where files are written or read from disk, their filenames must use a character set and character values which are safe for the underlying operating system.

70.0 For a given abstract URL, the designer may create multiple locale-specific template files (one per locale or language).

70.10 For a given page request, the system must be able to select an appropriate locale-specific template file to use. The request locale is computed as per requirement 20.0.

70.20 A template file may be created for a partial locale (language only, without a territory), and the request processor should be able to find the closest match for the current request locale.
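The closest-match rule in 70.10 and 70.20 can be sketched as a most-specific-first search. This Python example assumes a hypothetical file naming convention (`index.sv_SE.adp`, `index.sv.adp`, `index.adp`) and a hypothetical function name; the requirement itself does not fix either.

```python
from typing import Iterable, Optional


def select_template(base: str, locale: str, available: Iterable[str],
                    ext: str = "adp") -> Optional[str]:
    """Return the most specific existing template for the request locale."""
    names = set(available)
    language = locale.split("_")[0]
    # Try full locale, then language only, then the locale-neutral template.
    for suffix in (f".{locale}", f".{language}", ""):
        candidate = f"{base}{suffix}.{ext}"
        if candidate in names:
            return candidate
    return None
```

With templates `index.adp` and `index.sv.adp` on disk, a request in `sv_FI` would select `index.sv.adp`, while a request in `da_DK` would fall back to `index.adp`.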

70.30 A template file may be created in any character set. The system must have a way to know which character set a template file contains, so it can properly process it.

70.50 The properties of a datasource column may include a datatype so that the templating system can format the output for the current locale. The datatype is defined by a standard OpenACS datatype plus a format token or format string, for example: a date column might be specified as 'current_date:date LONG,' or 'current_date:date "YYYY-Mon-DD"'

70.60 The forms API must support construction of locale-specific HTML form widgets, such as date entry widgets, and form validation of user input data for locale-specific data, such as dates or numbers. NOT IMPLEMENTED in 5.0.0.

70.70 For forms which allow users to upload files, a standard method for a user to indicate the charset of a text file being uploaded must be provided.

Design note: this presumably applies to uploading data to the content repository as well

80.10 Support API for correct collation (sorting order) on lists of strings in locale-dependent way.

80.20 For the Tcl API, we will say that locale-dependent sorting will use Oracle SQL operations (i.e., we won't provide a Tcl API for this). We require a Tcl API function to return the correct incantation of NLS_SORT to use for a given locale with ORDER BY clauses in queries.
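Requirement 80.20 asks for a function that returns the NLS_SORT incantation for a locale. The Python sketch below shows the shape such a function might take; it is not the Tcl API itself. The function name and the locale-to-sort-name table are assumptions with illustrative entries, and the generated clause relies on Oracle's `NLSSORT` SQL function.

```python
# Hypothetical mapping from locale to Oracle linguistic sort name.
NLS_SORT_BY_LOCALE = {
    "de_DE": "GERMAN",
    "fr_FR": "FRENCH",
    "sv_SE": "SWEDISH",
}


def order_by_clause(column: str, locale: str) -> str:
    """Build an ORDER BY clause that sorts a column linguistically for a locale."""
    # Only allow simple column names, since the name is spliced into SQL text.
    if not column.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unsafe column name: {column!r}")
    # Unknown locales fall back to binary (byte value) ordering.
    sort_name = NLS_SORT_BY_LOCALE.get(locale, "BINARY")
    return f"ORDER BY NLSSORT({column}, 'NLS_SORT={sort_name}')"
```

A query author would append the result to a SELECT statement, for example `select last_name from persons ` followed by `order_by_clause("last_name", "sv_SE")`; the table and column names here are made up for the example.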

80.40 The system must handle full-text search in any supported language.

90.10 Provide API support for specifying a time zone

90.20 Provide an API for computing time and date operations which are aware of timezones. So for example a calendar module can properly synchronize items inserted into a calendar from users in different time zones using their own local times.

90.30 Store all dates and times in universal time zone, UTC.

90.40 For registered users, a time zone preference should be stored.

90.50 For a non-registered user a time zone preference should be attached via a session or else UTC should be used to display every date and time.

90.60 The default if we can't determine a time zone is to display all dates and times in some universal time zone such as GMT.
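The storage and display rules in 90.30 through 90.60 can be sketched as two small functions. This Python illustration uses fixed UTC offsets to stay self-contained, which is a simplification: real time zones also need daylight saving rules. The function names and the `user_tz` and `session_tz` parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional


def to_utc(local_time: datetime) -> datetime:
    """Convert a time zone aware datetime to UTC before storing it (90.30)."""
    if local_time.tzinfo is None:
        raise ValueError("refusing to store a datetime without a time zone")
    return local_time.astimezone(timezone.utc)


def display_time(stored_utc: datetime,
                 user_tz: Optional[tzinfo] = None,
                 session_tz: Optional[tzinfo] = None) -> datetime:
    """Pick the display zone: user preference, then session, then UTC."""
    if user_tz is not None:        # 90.40: registered user's preference
        zone = user_tz
    elif session_tz is not None:   # 90.50: zone attached to the session
        zone = session_tz
    else:                          # 90.60: universal default
        zone = timezone.utc
    return stored_utc.astimezone(zone)


# A calendar item entered at 09:00 in a UTC+9 zone, viewed from UTC-5.
tokyo = timezone(timedelta(hours=9))
new_york = timezone(timedelta(hours=-5))
stored = to_utc(datetime(2003, 8, 14, 9, 0, tzinfo=tokyo))
shown = display_time(stored, user_tz=new_york)
```

Because every stored value is in UTC, items entered by users in different zones sort and compare correctly without knowing who entered them.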

100.10 Since UTF8 strings can use up to three (UCS2) or six (UCS4) bytes per character, make sure that column size declarations in the schema are large enough to accommodate required data (such as email addresses in Japanese). Since 5.0.0, this is covered in the database install instructions for both PostgreSQL and Oracle.
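The gap between character count and byte count is easy to check directly. The Python sketch below measures the UTF-8 size of a value against a byte-sized column; the function names and the sample address are made up for this example.

```python
def utf8_length(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_column(text: str, max_bytes: int) -> bool:
    """True if the text fits in a column declared with a byte limit."""
    return utf8_length(text) <= max_bytes


# Hypothetical address with a two-character Japanese local part.
address = "山田@example.jp"
characters = len(address)       # 13 characters
size = utf8_length(address)     # 17 bytes: 3 per kanji, 1 per ASCII character
```

A column sized for 13 bytes would hold this value when counted in characters but reject it when counted in bytes, which is the situation the requirement warns about.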

When sending an email message, just as when delivering the content of a web page over an HTTP connection, it is necessary to be able to specify what character set encoding to use.

110.10 The email message sending API will allow for a character set encoding to be specified.

110.20 The email accepting API will allow for character set to be parsed correctly (hopefully a well formatted message will have a MIME character set content type header)
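For illustration, requirements 110.10 and 110.20 can be demonstrated with Python's standard `email` package, which records the character set in the MIME Content-Type header on sending and reads it back on receipt. The function names and addresses are hypothetical, and this is not how acs_mail or acs_mail_lite work.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import Optional, Tuple


def build_message(sender: str, recipient: str, subject: str, body: str,
                  charset: str = "utf-8") -> EmailMessage:
    """Create a text message whose body is encoded in the given charset."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject  # non-ASCII subjects are header-encoded on output
    msg.set_content(body, charset=charset)
    return msg


def read_body(raw: bytes) -> Tuple[Optional[str], str]:
    """Parse a raw message and decode its body using the declared charset."""
    msg = message_from_bytes(raw, policy=policy.default)
    return msg.get_content_charset(), msg.get_content()


outgoing = build_message("a@example.com", "b@example.com",
                         "Hej världen", "Hej världen\n", charset="iso-8859-1")
declared, text = read_body(outgoing.as_bytes())
```

The receiving side depends on the sender having set the header, which matches the caveat in 110.20 about well-formatted messages.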

Mail is not internationalized. The following issues must be addressed.

Six different functions currently call ns_sendmail. This means that there are six different end points for sending mail. This should be brought down to no more than two (one for acs_mail and one for acs_mail_lite), and ideally just one. Functions that currently call ns_sendmail directly should instead call acs_mail_lite.
Outgoing email functions (acs_mail and acs_mail_lite) must do the following: 1) Determine the appropriate language or languages to use for the message subject and message body. 2) Encode the subject and body appropriately and set message headers, in accordance with RFC 3282 (http://www.ietf.org/rfc/rfc3282.txt) and other RFCs.
Extreme Use case: Web site has a default language of Danish. A forum is set up for Swedes, so the forum has a package_id and a language setting of Swedish. A poster posts to the forum in Russian (is this possible?). A user is subscribed to the forum and has a language preference of Chinese. What should be in the message body and message subject? INCOMPLETE - The mail functions in acs_mail and acs_mail_lite are not internationalized.
Incoming mail should be localized.

Because globalization touches many different parts of the system, we want to reduce the implementation risk by breaking the implementation into phases.

Document Revision #	Action Taken, Notes	When?	By Whom?
1	Updated with results of MIT-sponsored i18n work at Collaboraid.	14 Aug 2003	Joel Aufrecht
0.4	converting from HTML to DocBook and importing the document to the OpenACS kernel documents. This was done as a part of the internationalization of OpenACS and .LRN for the Heidelberg University in Germany	12 September 2002	Peter Marklund
0.3	comments from Christian	1/14/2000	Henry Minsky
0.2	Minor typos fixed, clarifications to wording	11/14/2000	Henry Minsky
0.1	Creation	11/08/2000	Henry Minsky