0.00%
Search · Index

Weblog Page

Filtered by category beginner, 11 - 20 of 42 Postings (all, summary)

Overview

Created by Gustaf Neumann, last modified by Malte Sussdorff 07 Aug 2007, at 04:18 PM

A good detailed, yet somewhat outdated article is written by Reuven Lerner for the Linux Journal. It gives a good introduction, but please read on !

OpenACS (Open Architecture Community System) is an advanced toolkit for building scalable, community-oriented web applications. If you're thinking of building an enterprise-level web application, OpenACS is a solid, scalable framework for building dynamic content driven sites.

OpenACS is a collection of pre-built applications and services that you can use to build your web site/application. Through a modular architecture, OpenACS has packages for user/groups management, content management, e-commerce, news, FAQs, calendar, forums, bug tracking, full-text searching, and much more.

OpenACS relies on AOLserver, the free, multithreaded, scalable, Tcl-enabled, web/application server used by America Online for most of its web sites, and a true ACID-compliant Relational Database Management System (RDBMS). Currently OpenACS supports PostgreSQL, an open source RDBMS, and Oracle and is easily extensible to other databases which support a comparable feature set.

The OpenACS toolkit is derived from the ArsDigita Community System (ACS). ArsDigita (now part of Red Hat, Inc.) kindly made their work available under the GPL, making all of this possible.

The OpenACS project was born when Don Baccus, Ben Adida, and others decided to port ACS from Oracle to PostgreSQL, thus making it a fully open-source solution. With OpenACS 4, Oracle and PostgreSQL support were combined in one code base and with OpenACS 5, support for internationalization and localization has been added.

A vibrant and productive community has sprung up around the OpenACS software and there are many volunteer contributors as well as a commercial companies able to provide support, hosting, and custom development. Many of the production users are actively funding and contributing work back to the project. Formal, consensus driven governance has been established (with semi-annual elections) which ensures the project serves the needs of it's constituents.

The OpenACS community would like to hear your comments and can help you in your endeavors with the system. Visit our web site and feel free to ask questions or provide feedback.

OpenACS: robust web development framework

Created by Anett Szabo, last modified by Anett Szabo 30 Jul 2007, at 01:58 PM

OpenACS: robust web development framework


Author: Rocael Hernández, Galileo University, Guatemala / OpenACS Core Team, roc@viaro.net
Author: Andrew Grumet, OpenACS Core Team, aegrumet@alum.mit.edu


Tcl/Tk 2005 Conference, Portland, Oregon


Abstract:


OpenACS is a full featured web development framework to create scalable applications
oriented to collaboration and online communities. Is in use by many big players such as
greenpeace.org or the e-learning platform of the MIT Sloan School of Management.
While the system is not trivial, here are explained some of the most interesting and still
relatively simple facilities that the framework provides. Everything from templating,
separating the code from the presentation, database interactions according to the
programming language, auto-documentation features, automated test engine,
internationalization, and many more are written in Tcl, which has shown to be extremely
powerful for writing the foundation logic and the application pages.


Advanced infrastructure

Created by Anett Szabo, last modified by Anett Szabo 30 Jul 2007, at 01:57 PM

3.1. Serving files: packages, instances, site-map, request processor


Like other Web environments OpenACS can serve familiar file types such as .html and
.gif files from a document root. The OpenACS standard document root is
$OACS_HOME/www. Put a file at $OACS_HOME/www/hello.html and it will appear
at http://yourserver.example.com/hello.html.
OpenACS can also run scripts that set up variables and display them in HTML-like
templates, and also embed templates within other templates via include and master/slave
tags. These topics are covered in the Template system section above. In the sections
below we explore OpenACS more advanced mechanisms for serving files.


3.1.1. Packages

OpenACS is modularized into a set of packages that can be found in the
$OACS_HOME/packages subdirectory. Each package may contain SQL scripts, Tcl
libraries and visible pages, as illustrated in the abbreviated directory layout below:

$OACS_HOME/
packages/
acs-admin/ # Core package.
acs-api-browser/ # Core package.
...
forums/ # Forums package.
catalog/ # i18n message catalogs.
forums.info # Package specification file.
lib/ # Re-usable tcl/adp templates.
sql/ # Data model scripts.
tcl/ # Tcl library.
www/ # Package document root.
forum-view.tcl
forum-view.adp
...
www/ # Default document root.

This example draws attention to the forums package, one of dozens of application
packages available for use with OpenACS. Other available packages include a Webbased
files storage system (which also is WebDAV-enabled), calendaring, blog
authoring, assessment, news aggregation, wikis, photo galleries, RSS support, XMLRPC,
SOAP support and many more. A full list of packages can be browsed at
http://cvs.openacs.org/cvs/openacs-4/packages/.
Packages are managed with the OpenACS package manager, which handles upgrades and
tracks versions, dependencies and files much like Linux package managers do.
A view of the Package Manager:


3.1.2. Site map, package instances, request processor

Each package can have its own document root that functions like the default document
root at $OACS_HOME/www. The document root for the forums package is located at
$OACS_HOME/packages/forums/www, as illustrated in the abbreviated directory layout
above.
Package document roots are mapped to visible URLs through a set of database tables,
configuration data, libraries and administration Web pages known collectively as the site
map. Using the site map we can map $OACS_HOME/packages/forums/www to, for
example, http://yourserver.example.com/tclers-forums/. But it gets more interesting,
because the site map allows for re-use of packages at multiple URLs. Hence we can host
a discussion forum for C programmers by adding a new site map entry that maps the
forums package to http://yourserver.example.com/cprogrammers-forums/.
These mappings are referred to in OpenACS-speak as “package instances”. As the
terminology hints, the mapping to /tclers-forums has distinct configuration data from the
mapping to /cprogrammers-forums. Hence the /tclers-forums instance might contain a
Tcl Forum, a Tk Forum and an AOLserver Forum, while the /cprogrammers-forums
instance contains a Small and Fast Forum and a Some Compilation Required Forum.
Because package instances are OpenACS objects, they can be have different permission
settings, so that some users may be able to read and post to the /tclers-forums but not the
/cprogrammers-forums, and vice-versa. The OpenACS object system will be covered in
more detail below.
Before doing that, let’s take a brief diversion into the mechanics of how files are served,
Requests for OpenACS Web pages pass through a Request Processor, which is a global
filter and set of Tcl procs that respond to every incoming URL reaching the server. The
following diagram summarizes the stages of the request processor assuming a URL
request like http://yourserver.example.com/notes/somepage.adp.
The stages are:
1. Search the Site Map, to map the URL to the appropriate physical directory in the
filesystem.
2. Authenticate the user.
3. Authorize the possible specific defined permissions that the site node might have.
4. Process the URL, search for the appropriate file and server it.


 
3.2. Object system and services


Deep in OpenACS’ design is the notion that one should be able to build common services
that are useful across the toolkit. Hence a commenting engine ought work equally well
for blog posts as it does for images in photo galleries. Furthermore, any sufficiently
interesting piece of data in the system ought to carry basic accounting information with it,
such as who created it and timestamps for creation and modification.
The OpenACS object system addresses these requirements by defining a central SQL
table called acs_objects and giving this table a column for each generic bit of
information. Most importantly, acs_objects has a primary key column, named
object_id. This primary key is the anchor on which all other information about an
object rests. If the object is a blog post, the post body might live in a separate table
blog_posts whose primary key, post_id, is a reference back to
acs_objects.object_id. If the object is an image, it might contain a binary field
containing the image bits or alternatively a text field pointing to the physical storage
location of the image file, and also an image_id primary key that is a reference back to
acs_objects.object_id. Since each blog post and each image has a row in
acs_objects, comments on either can be inserted into a table that contains an
on_what_object column that points back to the object_id.
Any data that participates in the OpenACS object system can tell us its title, what kind of
object it is, when it was created and by whom. It can also be assigned permissions,
commented on, categorized, have files attached to it, and benefit from any other objectlevel
services that we can dream up.
The OpenACS object system, site map and instances are the foundation of OpenACS.
More information about these can be found at the following URLs:
http://openacs.org/doc/openacs-5-1/request-processor.html
http://openacs.org/doc/openacs-5-1/subsites.html
http://openacs.org/doc/openacs-5-1/packages.html


 3.3. Developer Support


OpenACS provides a set of developer support tools to improve the development process,
debugging, testing, and searching of the API. These tools enhance many of the day to day
activities of the developers.
The functionalities that developer support provides are:
1. Time to serve a given request. Good for performance problem detection and
improvement.
2. Tracing database calls involved in a given request, where the involved queries
are located, what was the actual query executed and how long did it take to return
the data. Useful for improving queries that might be slowing your application
performance.
3. Tracing scripts involved in serving a given request. Showing the time taken to
perform a given script, its location, code, and error that it might bring. Especially
important for tuning applications performance.
4. ADP reveal, to show in the browser which part of the rendered html belongs to a
given script.
5. User Switching, as an admin account switch easily to another account to
reproduce and test actions, and then simply go back to the administrator account.
6. OpenACS Shell, which is a Tcl shell with all the API available within OpenACS,
in order to simplify the testing of small pieces of Tcl within your browser.
An screenshot of the ADP Reveal:
The Tracing Script view, information shown at the bottom of each server page:
OpenACS also provides an implicit mechanism to document any script or procedure that
you might create, and then display and search that scripts or procedures, its documented
information, expected inputs and outputs, and its code though a Web interface at your
own OpenACS installation, like yourserver.com/api-doc, have a look here:
http://openacs.org/api-doc/
Finally, using the package manager, the Web server can be instructed to reload Tcl
library files without a restart of the webserver, and keep watching them for subsequent
changes, which is quite useful for the developers.


 
3.5. Testing Framework


OpenACS provides a full featured testing framework to create, maintain and run
automated test in your applications. The main characteristics of it are:
- Define tests, as smoke, config, database, web, etc. in a per package basis. And
with the API provided by the test framework you have a UI to check results, and
log events when executing a test case.
- Test your application at the proc level, using specific Test API to define and
check if your procs are behaving as expected.
- Test your end-user web scripts using a third party tool, such as tclwebtest or
perl::mechanize, to test the end-user pages automatically simulating end user
navigation (clicks) through the application and finally check the outpus using the
Test API.
- Define procedures that many tests might call, in order to automate similar
activities that need to be done before running a test case.
- Rollback the generated data after executing a test case.
- Rollback code section, to perform actions that will rollback the actions done by
running a test case, specially designed for those situations where you cannot
encapsulate the database calls to rollback, like when you are calling the scripts
with an automated tool.
- Run tests by installation, package or by choosing an specific test case.


 
3.6. Internationalization


OpenACS provide a facility to internationalize its user interface texts (not the data stored)
to any desired language, right now OpenACS is translated to more than 20 languages.
The end users can change the language user interface by simply selecting it. Each
OpenACS package can be internationalized separately, maintaining a message keys
catalog, which consist in a specific XML DTD where each possible user interface
message is stored as a message key, and then translated to a given language. Although all
the data is stored in the xml file, everything is also stored in the database, for
performance reasons and to facilitate the edition of the message keys OpenACS provides
a simple UI for translation of message key. And exists an official translation server for
the community at: translate.openacs.org.


3.7. Callbacks


The callbacks are a simple method to execute procedures stored in each of the possible
installed packages that a given OpenACS installation might have. The objective is to give
the core applications to invoke in non-core packages that may or may not be installed, but
without cluttering core with code that belongs to other packages, making possible to have
independence among packages.
The architecture work as

-> Core proc for removing user
-> Invoke callbacks based on what's installed
-> Package A logic for removing user
-> Package B logic for removing user
-> Package C logic for removing user
...
-> Core logic for removing user

 


as opposed to this architecture


-> Package A proc for removing user
-> Package A logic for removing user
-> Call core proc for removing user
-> Package B proc for removing user
-> Package B logic for removing user
-> Call core proc for removing user

 

Callback implementations would be declared like this:

ad_proc -callback module::op -implementation implname { ... }

 Where ad_proc is a wrapper of the normal TCL proc, which mainly gives autodocumentation
structure for any procedure.
Core uses tcl introspection to find which callbacks exist and invokes them. eg
foreach proc [info procs ::callback::module::op::impl::*] {
$proc $args
}
To invoke a callback you do this
callback [ -catch ] [ -impl impl ] callback [ args... ]
The callbacks is a great improvement in order to keep simple yet separated, coherent and
modular a web framework that constantly evolves and grows. The larger goal is to
promote reuse of standard pages and functions by removing the need to create perpackage
versions of these.

Domain level tools

Created by Anett Szabo, last modified by Anett Szabo 30 Jul 2007, at 01:40 PM

4.1. Content Repository


The Content Repository (CR) is a central service application that can be used by other
applications to manage its content. The developer must define and declare the
interactions with the content repository. The main features that the content repository are:
- CR is capable to store any kind of information (files, data, text).
- Define application specific data structures that will use the CR features and
services within the application context.
- Revision control, which means that every addition and subsequent changes of an
application is registered, so the versions are kept and can be roll-backed.
- Ability to handle hierarchies, and folder structures, and inherit its properties.
- Easily interact with the CR and its specific defined structures through a welldefined
API, in order to add data or present it.
- Also handles publish states for content that might need it.
- Identify content through a content type definition.
- Associate the content with external applications.
- Store specific templates to use when displaying specific content types.
- Create relations between content repository items or external database objects.
- Embedded search facility to automatically expose the content to the search engine
the system might be using.
The process for an application to use the content repository goes as:
1. Define the database structure your application needs that will use some of the CR
features.
2. Use the standardized API (to add, edit, delete, present) to create scripts for your
CR-enabled-application.
The CR logic is stored in database functions and triggers, but it has a Tcl API that glue all
the involved database calls and makes straightforward for the developers to create
applications that use the CR.


4.2. Results: Vertical Applications


Within OpenACS a vertical application means a set packages that interact together to
provide a specific domain set of functionalities.
The following are good examples of vertical applications completely running in
OpenACS and using Tcl as the programming language:


4.2.1. Project Manager

Full featured Project manager, consist in a set of functionalities that can be summarized
as: track tasks, estimates and actual progress for a project. It consist in more that 5
packages. Features that are part f it: project creation, customers management, task
assignments, logs, comments, create processes, rates, etc.


4.2.2. Assessment Tool

Extremely powerful application to create several types of assessments such as online selftests,
surveys, tests and gathering of information in general. The main features of this
application are: IMS-QTI support for import and export of assessments, branching,
sections, several types of questions and many more are part of it. More information at:
http://cvs.openacs.org/cvs/*checkout*/openacs-
4/packages/assessment/www/doc/index.html?rev=1.6


4.2.3. Communities of Practice

Communities of Practice within OpenACS are represented by a set of tools available
through all the website to interact with any kind of applications. The main objective is to
give the power to the end user describe in many forms any object at the system and make
relations among them. Those tools are:
- Rate
- Comment
- Categorize
- Link
- Search
And those tools can be used in all the applications that the end user might have available,
such as forums, file storage, etc. For example, you can categorize a given file, and then
link it to a forum thread, which you are also rating and commenting. Then a third user
will search for given topic, and the first result will be the file, and when he looks at the
file, he’ll see as well all its related objects, such as the forum thread.


4.2.4. .LRN and the E-LANE project

.LRN (pronounced dot-learn, www.dotlrn.org) is a full featured LMS with extra
community building capabilities, uses packages available for OpenACS and integrate
other vertical applications described here. Its focus is to help the creation of learning and
research communities, and not just server as an LMS.
The E-LANE Project (European-Latin America New Education, www.e-lane.org) is
using .LRN as its foundation technology to promote and demonstrate e-learning within
this geographical area, the demonstration is based on three fundamental factors: the
learning methodology, the content development, and the technological platform, the three
of them brought together to use it in real scenarios for further improvement of .LRN.
As a contribution to the .LRN community, E-LANE has developed among other
applications the user-tracking package, which is mainly focused to “Analyze users
behavior and syndicate it by object type”.
This package is basically using Tcl to create the UI and the administrative actions, and
glue them with awstats (http://awstats.sourceforge.net/), which in this case is the log
analyzer used to parse the AOLserver request logs, and those logs contains a set of
special keys written on each http request logged, which is the base to generate specific
request reports for a given user and/or object type.

Introduction

Created by Anett Szabo, last modified by Anett Szabo 30 Jul 2007, at 01:08 PM

 
The Open Architecture Community System (OpenACS) is a Web development
framework for building applications that support online communities.
OpenACS provides a robust infrastructure, building on top of the following standard
components: the Tcl programming language, a Postgres or Oracle database for storing the
application data, AOLserver for HTTP service and *nix or Windows operating systems.
Like other modern Web frameworks, OpenACS supports: templating for separating the
logic from the presentation, internationalization to present the user interface in the user’s
preferred language, a modular package system to create sub-applications, a role and
permissioning system, a content repository to store all manner of content and maintain
versioning.

SQL Tutorial

Created by Anett Szabo, last modified by Anett Szabo 26 Jul 2007, at 01:48 PM

What is SQL?

SQL (pronounced "ess-que-el") stands for Structured Query Language. SQL is used to communicate with a database. According to ANSI (American National Standards Institute), it is the standard language for relational database management systems. SQL statements are used to perform tasks such as update data on a database, or retrieve data from a database. Some common relational database management systems that use SQL are: Oracle, Sybase, Microsoft SQL Server, Access, Ingres, etc. Although most database systems use SQL, most of them also have their own additional proprietary extensions that are usually only used on their system. However, the standard SQL commands such as "Select", "Insert", "Update", "Delete", "Create", and "Drop" can be used to accomplish almost everything that one needs to do with a database. This tutorial will provide you with the instruction on the basics of each of these commands as well as allow you to put them to practice using the SQL Interpreter.

 

Table Basics

A relational database system contains one or more objects called tables. The data or information for the database are stored in these tables. Tables are uniquely identified by their names and are comprised of columns and rows. Columns contain the column name, data type, and any other attributes for the column. Rows contain the records or data for the columns. Here is a sample table called "weather".

city, state, high, and low are the columns. The rows contain the data for this table:

 

Weather
city state high low
Phoenix Arizona 105 90
Tucson Arizona 101 92
Flagstaff Arizona 88 69
San Diego California 77 60
Albuquerque New
Mexico
80 72

 

 

Selecting Data

The select statement is used to query the database and retrieve selected data that match the criteria that you specify. Here is the format of a simple select statement:

select "column1"
[,"column2",etc]
from "tablename"
[where "condition"];
[] = optional

The column names that follow the select keyword determine which columns will be returned in the results. You can select as many column names that you'd like, or you can use a "*" to select all columns.

The table name that follows the keyword from specifies the table that will be queried to retrieve the desired results.

The where clause (optional) specifies which data values or rows will be returned or displayed, based on the criteria described after the keyword where.

Conditional selections used in the where clause:

= Equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
<> Not equal to
LIKE *See note below

The LIKE pattern matching operator can also be used in the conditional selection of the where clause. Like is a very powerful operator that allows you to select only rows that are "like" what you specify. The percent sign "%" can be used as a wild card to match any possible character that might appear before or after the characters specified. For example:

 

Sample Table: empinfo
first last id age city state
John Jones 99980 45 Payson Arizona
Mary Jones 99982 25 Payson Arizona
Eric Edwards 88232 32 San Diego California
Mary Ann Edwards 88233 32 Phoenix Arizona
Ginger Howell 98002 42 Cottonwood Arizona
Sebastian Smith 92001 23 Gila Bend Arizona
Gus Gray 22322 35 Bagdad Arizona
Mary Ann May 32326 52 Tucson Arizona
Erica Williams 32327 60 Show Low Arizona
Leroy Brown 32380 22 Pinetop Arizona
Elroy Cleaver 32382 22 Globe Arizona

 

select first, last, city
from empinfo
where first LIKE 'Er%';

This SQL statement will match any first names that start with 'Er'. Strings must be in single quotes.

 

Or you can specify,

select first, last
from empinfo
where last LIKE '%s';

This statement will match any last names that end in a 's'.

select * from empinfo
where first = 'Eric';

This will only select rows where the first name equals 'Eric' exactly.

 

What will be the result of the following sample SELECT statements?

 

select first, last, city from empinfo; 

first last city
John Jones Payson
Mary Jones Payson
Eric Edwards San Diego
Mary Ann Edwards Phoenix
Ginger Howell Cottonwood
Sebastian Smith Gila Bend
Gus Gray Bagdad
Mary Ann May Tucson
Erica Williams Show Low
Leroy Brown Pinetop
Elroy Cleaver Globe
 

select last, city, age from empinfo 
where age > 30;

last city age
Jones Payson 45
Edwards San Diego 32
Edwards Phoenix 32
Howell Cottonwood 42
Gray Bagdad 35
May Tucson 52
Williams Show Low 60
 

 

select first, last, city, state from empinfo
       where first LIKE 'J%';

first last city state
John Jones Payson Arizona

 

select * from empinfo;

first last id age city state
John Jones 99980 45 Payson Arizona
Mary Jones 99982 25 Payson Arizona
Eric Edwards 88232 32 San Diego California
Mary Ann Edwards 88233 32 Phoenix Arizona
Ginger Howell 98002 42 Cottonwood Arizona
Sebastian Smith 92001 23 Gila Bend Arizona
Gus Gray 22322 35 Bagdad Arizona
Mary Ann May 32326 52 Tucson Arizona
Erica Williams 32327 60 Show Low Arizona
Leroy Brown 32380 22 Pinetop Arizona
Elroy Cleaver 32382 22 Globe Arizona

 

select first, last, from empinfo
       where last LIKE '%s'; 
 

first last
John Jones
Mary Jones
Eric Edwards
Mary Ann Edwards
Erica Williams

 

 select first, last, age from empinfo
       where last LIKE '%illia%';

first last age
Erica Williams 60

 

select * from empinfo where first = 'Eric'; 

first last id age city state
Eric Edwards 88232 32 San Diego California

 

 

Exercise 1.

Enter select statements to:

  1. Display the first name and age for everyone that's in the table.
  2. Display the first name, last name, and city for everyone that's not from Payson.
  3. Display all columns for everyone that is over 40 years old.
  4. Display the first and last names for everyone whose last name ends in an "ay".
  5. Display all columns for everyone whose first name equals "Mary".
  6. Display all columns for everyone whose first name contains "Mary".

Answers

 

Exercise 2.

items_ordered

customerid order_date item quantity price
10330 30-Jun-1999 Pogo stick 1 28.00
10101 30-Jun-1999 Raft 1 58.00
10298 01-Jul-1999 Skateboard 1 33.00
10101 01-Jul-1999 Life Vest 4 125.00
10299 06-Jul-1999 Parachute 1 1250.00
10339 27-Jul-1999 Umbrella 1 4.50
10449 13-Aug-1999 Unicycle 1 180.79
10439 14-Aug-1999 Ski Poles 2 25.50
10101 18-Aug-1999 Rain Coat 1 18.30
10449 01-Sep-1999 Snow Shoes 1 45.00
10439 18-Sep-1999 Tent 1 88.00
10298 19-Sep-1999 Lantern 2 29.00
10410 28-Oct-1999 Sleeping Bag 1 89.22
10438 01-Nov-1999 Umbrella 1 6.75
10438 02-Nov-1999 Pillow 1 8.50
10298 01-Dec-1999 Helmet 1 22.00
10449 15-Dec-1999 Bicycle 1 380.50
10449 22-Dec-1999 Canoe 1 280.00
10101 30-Dec-1999 Hoola Hoop 3 14.75
10330 01-Jan-2000 Flashlight 4 28.00
10101 02-Jan-2000 Lantern 1 16.00
10299 18-Jan-2000 Inflatable Mattress 1 38.00
10438 18-Jan-2000 Tent 1 79.99
10413 19-Jan-2000 Lawnchair 4 32.00
10410 30-Jan-2000 Unicycle 1 192.50
10315 2-Feb-2000 Compass 1 8.00
10449 29-Feb-2000 Flashlight 1 4.50
10101 08-Mar-2000 Sleeping Bag 2 88.70
10298 18-Mar-2000 Pocket Knife 1 22.38
10449 19-Mar-2000 Canoe paddle 2 40.00
10298 01-Apr-2000 Ear Muffs 1 12.50
10330 19-Apr-2000 Shovel 1 16.75

 

Enter select statements to:

  1. From the items_ordered table, select a list of all items purchased for customerid 10449. Display the customerid, item, and price for this customer.
  2. Select all columns from the items_ordered table for whoever purchased a Tent.
  3. Select the customerid, order_date, and item values from the items_ordered table for any items in the item column that start with the letter "S".
  4. Select the distinct items in the items_ordered table. In other words, display a listing of each of the unique items from the items_ordered table.
  5. Make up your own select statements and submit them.

Answers

 

 

Aggregate Functions

 

MIN returns the smallest value in a given column
MAX returns the largest value in a given column
SUM returns the sum of the numeric values in a given column
AVG returns the average value of a given column
COUNT returns the total number of values in a given column
COUNT(*) returns the number of rows in a table

 

Aggregate functions are used to compute against a "returned column of numeric data" from your SELECT statement. They basically summarize the results of a particular column of selected data. We are covering these here since they are required by the next topic, "GROUP BY". Although they are required for the "GROUP BY" clause, these functions can be used without the "GROUP BY" clause. For example:

SELECT AVG(salary)
FROM employee;

This statement will return a single result which contains the average value of everything returned in the salary column from the employee table.

Another example:

SELECT AVG(salary)
    FROM employee;
WHERE title = 'Programmer';

 This statement will return the average salary for all employees whose title is equal to 'Programmer'

 Example:

valueprice_apiece
110
1009
5008
10007

 

 

Example:

SELECT Count(*)
FROM employees;

This particular statement is slightly different from the other aggregate functions since there isn't a column supplied to the count function. This statement will return the number of rows in the employees table.

 

 

Exercise 3.

  1. Select the maximum price of any item ordered in the items_ordered table ( see above ). Hint: Select the maximum price only.>
  2. Select the average price of all of the items ordered that were purchased in the month of Dec.
  3. What are the total number of rows in the items_ordered table?
  4. For all of the tents that were ordered in the items_ordered table, what is the price of the lowest tent? Hint: Your query should return the price only.

Answers

 

 

GROUP BY clause

 

The GROUP BY clause will gather all of the rows together that contain data in the specified column(s) and will allow aggregate functions to be performed on the one or more columns. This can best be explained by an example:

GROUP BY clause syntax:

SELECT column1, 
SUM(column2)
FROM "list-of-tables"
GROUP BY "column-list";

Let's say you would like to retrieve a list of the highest paid salaries in each dept:

SELECT max(salary), dept
FROM employee
GROUP BY dept;

This statement will select the maximum salary for the people in each unique department. Basically, the salary for the person who makes the most in each department will be displayed. Their, salary and their department will be returned.

 

What if I wanted to display their lastname too?

For example, take a look at the items_ordered table. Let's say you want to group everything of quantity 1 together, everything of quantity 2 together, everything of quantity 3 together, etc. If you would like to determine what the largest cost item is for each grouped quantity (all quantity 1's, all quantity 2's, all quantity 3's, etc.), you would enter:

SELECT quantity, max(price)
FROM items_ordered
GROUP BY quantity;

Enter the statement in above, and take a look at the results to see if it returned what you were expecting. Verify that the maximum price in each Quantity Group is really the maximum price.

 

Exercise 4.

customerid firstname lastname city state
10101 John Gray Lynden Washington
10298 Leroy Brown Pinetop Arizona
10299 Elroy Keller Snoqualmie Washington
10315 Lisa Jones Oshkosh Wisconsin
10325 Ginger Schultz Pocatello Idaho
10329 Kelly Mendoza Kailua Hawaii
10330 Shawn Dalton Cannon Beach Oregon
10338 Michael Howell Tillamook Oregon
10339 Anthony Sanchez Winslow Arizona
10408 Elroy Cleaver Globe Arizona
10410 Mary Ann Howell Charleston South Carolina
10413 Donald Davids Gila Bend Arizona
10419 Linda Sakahara Nogales Arizona
10429 Sarah Graham Greensboro North Carolina
10438 Kevin Smith Durango Colorado
10439 Conrad Giles Telluride Colorado
10449 Isabela Moore Yuma Arizona

 

  1. How many people are in each unique state in the customers table? Select the state and display the number of people in each. Hint: count is used to count rows in a column, sum works on numeric data only.
  2. From the items_ordered table, select the item, maximum price, and minimum price for each specific item in the table. Hint: The items will need to be broken up into separate groups.
  3. How many orders did each customer make? Use the items_ordered table. Select the customerid, number of orders they made, and the sum of their orders. Click the Group By answers link below if you have any problems.

Answers

 

 

HAVING clause

 

The HAVING clause allows you to specify conditions on the rows for each group - in other words, which rows should be selected will be based on the conditions you specify. The HAVING clause should follow the GROUP BY clause if you are going to use it.

HAVING clause syntax:

SELECT column1, 
SUM(column2)
FROM "list-of-tables"
GROUP BY "column-list"
HAVING "condition";

HAVING can best be described by example. Let's say you have an employee table containing the employee's name, department, salary, and age. If you would like to select the average salary for each employee in each department, you could enter:

SELECT dept, avg(salary)
FROM employee
GROUP BY dept;

But, let's say that you want to ONLY calculate & display the average if their salary is over 20000:

SELECT dept, avg(salary)
FROM employee
GROUP BY dept
HAVING avg(salary) > 20000;

 

Exercise 5.

  1. How many people are in each unique state in the customers table that have more than one person in the state? Select the state and display the number of how many people are in each if it's greater than 1.
  2. From the items_ordered table, select the item, maximum price, and minimum price for each specific item in the table. Only display the results if the maximum price for one of the items is greater than 190.00.
  3. How many orders did each customer make? Use the items_ordered table. Select the customerid, number of orders they made, and the sum of their orders if they purchased more than 1 item.

Answers

 

 

ORDER BY clause

 

ORDER BY is an optional clause which will allow you to display the results of your query in a sorted order (either ascending order or descending order) based on the columns that you specify to order by. 

 ORDER BY clause syntax:

SELECT column1, SUM(column2)
FROM "list-of-tables"
ORDER BY "column-list" [ASC | DESC];

[ ] = optional

This statement will select the employee_id, dept, name, age, and salary from the employee_info table where the dept equals 'Sales' and will list the results in Ascending (default) order based on their Salary.

ASC = Ascending Order - default
DESC = Descending Order


For example:

SELECT employee_id, dept, name, age, salary
FROM employee_info
WHERE dept = 'Sales'ORDER BY salary;

 

If you would like to order based on multiple columns, you must seperate the columns with commas. For example:

SELECT employee_id, dept, name, age, salary
FROM employee_info
WHERE dept = 'Sales'
ORDER BY salary, age DESC;

 

Exercise 6.

  1. Select the lastname, firstname, and city for all customers in the customers table. Display the results in Ascending Order based on the lastname.
  2. Same thing as exercise #1, but display the results in Descending order.
  3. Select the item and price for all of the items in the items_ordered table that the price is greater than 10.00. Display the results in Ascending order based on the price.

Answers

 

 

Combining conditions and Boolean Operators

 

The AND operator can be used to join two or more conditions in the WHERE clause. Both sides of the AND condition must be true in order for the condition to be met and for those rows to be displayed.

SELECT column1, 
SUM(column2)
FROM "list-of-tables"
WHERE "condition1" AND
"condition2";

The OR operator can be used to join two or more conditions in the WHERE clause also. However, either side of the OR operator can be true and the condition will be met - hence, the rows will be displayed. With the OR operator, either side can be true or both sides can be true.

For example:

SELECT employeeid, firstname, lastname, title, salary
FROM employee_info
WHERE salary >= 50000.00 AND title = 'Programmer';

 

This statement will select the employeeid, firstname, lastname, title, and salary from the employee_info table where the salary is greater than or equal to 50000.00 AND the title is equal to 'Programmer'. Both of these conditions must be true in order for the rows to be returned in the query. If either is false, then it will not be displayed.

Although they are not required, you can use paranthesis around your conditional expressions to make it easier to read:

SELECT employeeid, firstname, lastname, title, salary
FROM employee_info
WHERE (salary >= 50000.00) AND (title = 'Programmer');


Another Example:

SELECT firstname, lastname, title, salary
FROM employee_info
WHERE (title = 'Sales') OR (title = 'Programmer');

This statement will select the firstname, lastname, title, and salary from the employee_info table where the title is either equal to 'Sales' OR the title is equal to 'Programmer'.

 

Exercise 7.

  1. Select the customerid, order_date, and item from the items_ordered table for all items unless they are 'Snow Shoes' or if they are 'Ear Muffs'. Display the rows as long as they are not either of these two items.
  2. Select the item and price of all items that start with the letters 'S', 'P', or 'F'.

Answers

 

 

IN and BETWEEN Conditional Operators

 

SELECT col1, SUM(col2)
FROM "list-of-tables"
WHERE col3 IN
       (list-of-values);
SELECT col1, SUM(col2)
FROM "list-of-tables"
WHERE col3 BETWEEN value1
AND value2;

The IN conditional operator is really a set membership test operator. That is, it is used to test whether or not a value (stated before the keyword IN) is "in" the list of values provided after the keyword IN.

For example:

SELECT employeeid, lastname, salary
FROM employee_info
WHERE lastname IN ('Hernandez', 'Jones', 'Roberts', 'Ruiz');


This statement will select the employeeid, lastname, salary from the employee_info table where the lastname is equal to either: Hernandez, Jones, Roberts, or Ruiz. It will return the rows if it is ANY of these values.

The IN conditional operator can be rewritten by using compound conditions using the equals operator and combining it with OR - with exact same output results:

SELECT employeeid, lastname, salary
FROM employee_info
WHERE lastname = 'Hernandez' OR lastname = 'Jones'
OR lastname = 'Roberts'OR lastname = 'Ruiz';

As you can see, the IN operator is much shorter and easier to read when you are testing for more than two or three values.
You can also use NOT IN to exclude the rows in your list.

The BETWEEN conditional operator is used to test to see whether or not a value (stated before the keyword BETWEEN) is "between" the two values stated after the keyword BETWEEN.

For example:

SELECT employeeid, age, lastname, salary
FROM employee_info
WHERE age BETWEEN 30 AND 40;

This statement will select the employeeid, age, lastname, and salary from the employee_info table where the age is between 30 and 40 (including 30 and 40).

This statement can also be rewritten without the BETWEEN operator:

SELECT employeeid, age, lastname, salary
FROM employee_info
WHERE age >= 30 AND age <= 40;

You can also use NOT BETWEEN to exclude the values between your range.

 

Exercise 8.

  1. Select the date, item, and price from the items_ordered table for all of the rows that have a price value ranging from 10.00 to 80.00.
  2. Select the firstname, city, and state from the customers table for all of the rows where the state value is either: Arizona, Washington, Oklahoma, Colorado, or Hawaii.

Answers

 

 

Mathematical Operators

 

Standard ANSI SQL-92 supports the following first four basic arithmetic operators:

+ addition
- subtraction
* multiplication
/ division
% modulo

The modulo operator determines the integer remainder of the division. This operator is not ANSI SQL supported, however, most databases support it. The following are some more useful mathematical functions to be aware of since you might need them. These functions are not standard in the ANSI SQL-92 specs, therefore they may or may not be available on the specific RDBMS that you are using. However, they were available on several major database systems that I tested. They WILL work on this tutorial.

ABS(x) returns the absolute value of x
SIGN(x) returns the sign of input x as -1, 0, or 1 (negative, zero, or positive respectively)
MOD(x,y) modulo - returns the integer remainder of x divided by y (same as x%y)
FLOOR(x) returns the largest integer value that is less than or equal to x
CEILING(x) or CEIL(x) returns the smallest integer value that is greater than or equal to x
POWER(x,y) returns the value of x raised to the power of y
ROUND(x) returns the value of x rounded to the nearest whole integer
ROUND(x,d) returns the value of x rounded to the number of decimal places specified by the value d
SQRT(x) returns the square-root value of x

For example:

SELECT round(salary), firstname
FROM employee_info

This statement will select the salary rounded to the nearest whole value and the firstname from the employee_info table.

 

Exercise 9.

Select the item and per unit price for each item in the items_ordered table. Hint: Divide the price by the quantity.

Answer

 

 

Table Joins, a must

 

All of the queries up until this point have been useful with the exception of one major limitation - that is, you've been selecting from only one table at a time with your SELECT statement. It is time to introduce you to one of the most beneficial features of SQL & relational database systems - the "Join". To put it simply, the "Join" makes relational database systems "relational".

Joins allow you to link data from two or more tables together into a single query result--from one single SELECT statement.

A "Join" can be recognized in a SQL SELECT statement if it has more than one table after the FROM keyword.

For example:

SELECT "list-of-columns"
FROM table1,table2
WHERE "search-condition(s)"

Joins can be explained easier by demonstrating what would happen if you worked with one table only, and didn't have the ability to use "joins". This single table database is also sometimes referred to as a "flat table". Let's say you have a one-table database that is used to keep track of all of your customers and what they purchase from your store:

idfirst lastaddresscitystate zipdateitemprice

 

Everytime a new row is inserted into the table, all columns will be be updated, thus resulting in unnecessary "redundant data". For example, every time Wolfgang Schultz purchases something, the following rows will be inserted into the table:

idfirstlastaddress            citystatezipdateitemprice
10982WolfgangSchultz300 N. 1st AveYumaAZ85002032299snowboard45.00
10982WolfgangSchultz300 N. 1st AveYumaAZ85002082899snow shovel35.00
10982WolfgangSchultz300 N. 1st AveYumaAZ85002091199gloves15.00
10982WolfgangSchultz300 N. 1st AveYumaAZ85002100999lantern35.00
10982WolfgangSchultz300 N. 1st AveYumaAZ85002022900tent85.00

An ideal database would have two tables:

  1. One for keeping track of your customers
  2. And the other to keep track of what they purchase:

"Customer_info" table:

customer_numberfirstname lastnameaddresscitystate zip

"Purchases" table:

customer_numberdate itemprice

Now, whenever a purchase is made from a repeating customer, the 2nd table, "Purchases" only needs to be updated! We've just eliminated useless redundant data, that is, we've just normalized this database!

Notice how each of the tables have a common "cusomer_number" column. This column, which contains the unique customer number will be used to JOIN the two tables. Using the two new tables, let's say you would like to select the customer's name, and items they've purchased. Here is an example of a join statement to accomplish this:

SELECT customer_info.firstname, customer_info.lastname, purchases.item
FROM customer_info, purchases
WHERE customer_info.customer_number = purchases.customer_number;

This particular "Join" is known as an "Inner Join" or "Equijoin". This is the most common type of "Join" that you will see or use.

Notice that each of the colums are always preceeded with the table name and a period. This isn't always required, however, it IS good practice so that you wont confuse which colums go with what tables. It is required if the name column names are the same between the two tables. I recommend preceeding all of your columns with the table names when using joins.

Note: The syntax described above will work with most Database Systems -including the one with this tutorial. However, in the event that this doesn't work with yours, please check your specific database documentation.

Although the above will probably work, here is the ANSI SQL-92 syntax specification for an Inner Join using the preceding statement above that you might want to try:

SELECT customer_info.firstname, customer_info.lastname, purchases.item
FROM customer_info INNER JOIN purchases
ON customer_info.customer_number = purchases.customer_number;

SELECT customer_info.firstname, customer_info.lastname, purchases.item
FROM customer_info INNER JOIN purchases
ON customer_info.customer_number = purchases.customer_number;


Another example:

SELECT employee_info.employeeid, employee_info.lastname, employee_sales.comission
FROM employee_info, employee_sales
WHERE employee_info.employeeid = employee_sales.employeeid;
 

This statement will select the employeeid, lastname (from the employee_info table), and the comission value (from the employee_sales table) for all of the rows where the employeeid in the employee_info table matches the employeeid in the employee_sales table.

Exercises 10. 

  1. Write a query using a join to determine which items were ordered by each of the customers in the customers table. Select the customerid, firstname, lastname, order_date, item, and price for everything each customer purchased in the items_ordered table.
  2. Repeat exercise #1, however display the results sorted by state in descending order.
Answers

 

 

Creating Tables

The create table statement is used to create a new table. Here is the format of a simple create table statement:

create table "tablename"
("column1" "data type",
"column2" "data type",
"column3" "data type");

Format of create table if you were to use optional constraints:

create table "tablename"
("column1" "data type"
[constraint],
"column2" "data type"
[constraint],
"column3" "data type"
[constraint]);
[ ] = optional

Note: You may have as many columns as you'd like, and the constraints are optional.

 

Example:

create table employee
(first varchar(15),
last varchar(20),
age number(3),
address varchar(30),
city varchar(20),
state varchar(20));

 

To create a new table, enter the keywords create table followed by the table name, followed by an open parenthesis, followed by the first column name, followed by the data type for that column, followed by any optional constraints, and followed by a closing parenthesis. It is important to make sure you use an open parenthesis before the beginning table, and a closing parenthesis after the end of the last column definition. Make sure you seperate each column definition with a comma. All SQL statements should end with a ";".

The table and column names must start with a letter and can be followed by letters, numbers, or underscores - not to exceed a total of 30 characters in length. Do not use any SQL reserved keywords as names for tables or column names (such as "select", "create", "insert", etc).

Data types specify what the type of data can be for that particular column. If a column called "Last_Name", is to be used to hold names, then that particular column should have a "varchar" (variable-length character) data type.

Here are the most common Data types:

char(size)Fixed-length character string. Size is specified in parenthesis. Max 255 bytes.
varchar(size)Variable-length character string. Max size is specified in parenthesis.
number(size)Number value with a max number of column digits specified in parenthesis.
dateDate value
number(size,d)Number value with a maximum number of digits of "size" total, with a maximum number of "d" digits to the right of the decimal.

What are constraints? When tables are created, it is common for one or more columns to have constraints associated with them. A constraint is basically a rule associated with a column that the data entered into that column must follow. For example, a "unique" constraint specifies that no two records can have the same value in a particular column. They must all be unique. The other two most popular constraints are "not null" which specifies that a column can't be left blank, and "primary key". A "primary key" constraint defines a unique identification of each record (or row) in a table. All of these and more will be covered in the future Advanced release of this Tutorial. Constraints can be entered in this SQL interpreter, however, they are not supported in this Intro to SQL tutorial & interpreter. They will be covered and supported in the future release of the Advanced SQL tutorial - that is, if "response" is good.

It's now time for you to design and create your own table. You will use this table throughout the rest of the tutorial. If you decide to change or redesign the table, you can either drop it and recreate it or you can create a completely different one. The SQL statement drop will be covered later.

 

Exercise 11. 

You have just started a new company. It is time to hire some employees. You will need to create a table that will contain the following information about your new employees: firstname, lastname, title, age, and salary. 

Answer

 

 

Inserting into a Table

The insert statement is used to insert or add a row of data into the table.

To insert records into a table, enter the key words insert into followed by the table name, followed by an open parenthesis, followed by a list of column names separated by commas, followed by a closing parenthesis, followed by the keyword values, followed by the list of values enclosed in parenthesis. The values that you enter will be held in the rows and they will match up with the column names that you specify. Strings should be enclosed in single quotes, and numbers should not.

insert into "tablename"
(first_column,...last_column)
values (first_value,...last_value);

 

In the example below, the column name first will match up with the value 'Luke', and the column name state will match up with the value 'Georgia'.

Example:

insert into employee
(first, last, age, address, city, state)
values ('Luke', 'Duke', 45, '2130 Boars Nest',
'Hazard Co', 'Georgia');

 

Note: All strings should be enclosed between single quotes: 'string'

 

Exercise 12. 

It is time to insert data into your new employee table.

Your first three employees are the following:

Jonie Weber, Secretary, 28, 19500.00
Potsy Weber, Programmer, 32, 45300.00
Dirk Smith, Programmer II, 45, 75020.00

Enter these employees into your table first, and then insert at least 5 more of your own list of employees in the table.

After they're inserted into the table, enter select statements to:

  1. Select all columns for everyone in your employee table.
  2. Select all columns for everyone with a salary over 30000.
  3. Select first and last names for everyone that's under 30 years old.
  4. Select first name, last name, and salary for anyone with "Programmer" in their title.
  5. Select all columns for everyone whose last name contains "ebe".
  6. Select the first name for everyone whose first name equals "Potsy".
  7. Select all columns for everyone over 80 years old.
  8. Select all columns for everyone whose last name ends in "ith".

 

Answers

 

 

Updating Records

The update statement is used to update or change records that match a specified criteria. This is accomplished by carefully constructing a where clause.

update "tablename"
set "columnname" =
"newvalue"
[,"nextcolumn" =
"newvalue2"...]
where "columnname"
OPERATOR "value"
[and|or "column"
OPERATOR "value"];

[] = optional

 [The above example was line wrapped for better viewing on this Web page.]

 

Examples:

update phone_book
set area_code = 623
where prefix = 979;

update phone_book
set last_name = 'Smith', prefix=555, suffix=9292
where last_name = 'Jones';

update employee
set age = age+1
where first_name='Mary' and last_name='Williams';

 

Exercise 13.

After each update, issue a select statement to verify your changes.

  1. Jonie Weber just got married to Bob Williams. She has requested that her last name be updated to Weber-Williams.
  2. Dirk Smith's birthday is today, add 1 to his age.
  3. All secretaries are now called "Administrative Assistant". Update all titles accordingly.
  4. Everyone that's making under 30000 are to receive a 3500 a year raise.
  5. Everyone that's making over 33500 are to receive a 4500 a year raise.
  6. All "Programmer II" titles are now promoted to "Programmer III".
  7. All "Programmer" titles are now promoted to "Programmer II".

Answers

 

 

Deleting Records

The delete statement is used to delete records or rows from the table.

delete from "tablename"

where "columnname"
OPERATOR "value"
[and|or "column"
OPERATOR "value"];

[ ] = optional

[The above example was line wrapped for better viewing on this Web page.]

Examples:

delete from employee;

 Note: if you leave off the where clause, all records will be deleted!

 

delete from employee
where lastname = 'May';

delete from employee
where firstname = 'Mike' or firstname = 'Eric';


To delete an entire record/row from a table, enter "delete from" followed by the table name, followed by the where clause which contains the conditions to delete. If you leave off the where clause, all records will be deleted.

 

Exercise 14.

(Use the select statement to verify your deletes):

  1. Jonie Weber-Williams just quit, remove her record from the table.
  2. It's time for budget cuts. Remove all employees who are making over 70000 dollars.

Answers

 

 

Drop a Table

The drop table command is used to delete a table and all rows in the table.

To delete an entire table including all of its rows, issue the drop table command followed by the tablename. drop table is different from deleting all of the records in the table. Deleting all of the records in the table leaves the table including column and constraint information. Dropping the table removes the table definition as well as all of its rows.

drop table "tablename"

 

Example:

drop table myemployees_ts0211;

 
 

Exercise 15.

Drop your employee table.

 

 

 

---

based on http://sqlcourse2.com/ 


 

 

 

Simple queries

Created by Anett Szabo, last modified by Anett Szabo 26 Jul 2007, at 12:11 PM

If you start up SQL*Plus, you can start browsing around immediately with the SELECT statement. You don't even need to define a table; Oracle provides the built-in dual table for times when you're interested in a constant or a function:


SQL> select 'Hello World' from dual;

'HELLOWORLD
-----------
Hello World

SQL> select 2+2 from dual;

2+2
----------
4

SQL> select sysdate from dual;

SYSDATE
----------
1999-02-14
... or to test your knowledge of three-valued logic (see the "Data Modeling" chapter):

SQL> select 4+NULL from dual;

4+NULL
----------

(any expression involving NULL evaluates to NULL).

There is nothing magic about the dual table for these purposes; you can compute functions using the bboard table instead of dual:


select sysdate,2+2,atan2(0, -1) from bboard;

SYSDATE 2+2 ATAN2(0,-1)
---------- ---------- -----------
1999-01-14 4 3.14159265
1999-01-14 4 3.14159265
1999-01-14 4 3.14159265
1999-01-14 4 3.14159265
...

1999-01-14 4 3.14159265
1999-01-14 4 3.14159265
1999-01-14 4 3.14159265

55010 rows selected.
but not everyone wants 55010 copies of the same result. The dual table is predefined during Oracle installation and, though it is just a plain old table, it is guaranteed to contain only one row because no user will have sufficient privileges to insert or delete rows from dual.

 

Getting beyond Hello World

To get beyond Hello World, pick a table of interest. As we saw in the introduction,
select * from users;
would retrieve all the information from every row of the users table. That's good for toy systems but in any production system, you'd be better off starting with

SQL> select count(*) from users;

COUNT(*)
----------
7352
You don't really want to look at 7352 rows of data, but you would like to see what's in the users table, start off by asking SQL*Plus to query Oracle's data dictionary and figure out what columns are available in the users table:

SQL> describe users
Name Null? Type
------------------------------- -------- ----
USER_ID NOT NULL NUMBER(38)
FIRST_NAMES NOT NULL VARCHAR2(100)
LAST_NAME NOT NULL VARCHAR2(100)
PRIV_NAME NUMBER(38)
EMAIL NOT NULL VARCHAR2(100)
PRIV_EMAIL NUMBER(38)
EMAIL_BOUNCING_P CHAR(1)
PASSWORD NOT NULL VARCHAR2(30)
URL VARCHAR2(200)
ON_VACATION_UNTIL DATE
LAST_VISIT DATE
SECOND_TO_LAST_VISIT DATE
REGISTRATION_DATE DATE
REGISTRATION_IP VARCHAR2(50)
ADMINISTRATOR_P CHAR(1)
DELETED_P CHAR(1)
BANNED_P CHAR(1)
BANNING_USER NUMBER(38)
BANNING_NOTE VARCHAR2(4000)
The data dictionary is simply a set of built-in tables that Oracle uses to store information about the objects (tables, triggers, etc.) that have been defined. Thus SQL*Plus isn't performing any black magic when you type describe; it is simply querying user_tab_columns, a view of some of the tables in Oracle's data dictionary. You could do the same explicitly, but it is a little cumbersome.

column fancy_type format a20
select column_name, data_type || '(' || data_length || ')' as fancy_type
from user_tab_columns
where table_name = 'USERS'
order by column_id;
Here we've had to make sure to put the table name ("USERS") in all-uppercase. Oracle is case-insensitive for table and column names in queries but the data dictionary records names in uppercase. Now that we know the names of the columns in the table, it will be easy to explore.

 

Simple Queries from One Table

A simple query from one table has the following structure:
  • the select list (which columns in our report)
  • the name of the table
  • the where clauses (which rows we want to see)
  • the order by clauses (how we want the rows arranged)
Let's see some examples. First, let's see how many users from MIT are registered on our site:

SQL> select email
from users
where email like '%mit.edu';

EMAIL
------------------------------
philg@mit.edu
andy@california.mit.edu
ben@mit.edu
...
wollman@lcs.mit.edu
ghomsy@mit.edu
hal@mit.edu
...
jpearce@mit.edu
richmond@alum.mit.edu
andy_roo@mit.edu
kov@mit.edu
fletch@mit.edu
lsandon@mit.edu
psz@mit.edu
philg@ai.mit.edu
philg@martigny.ai.mit.edu
andy@californnia.mit.edu
ty@mit.edu
teadams@mit.edu

68 rows selected.

The email like '%mit.edu' says "every row where the email column ends in 'mit.edu'". The percent sign is Oracle's wildcard character for "zero or more characters". Underscore is the wildcard for "exactly one character":

SQL> select email
from users
where email like '___@mit.edu';

EMAIL
------------------------------
kov@mit.edu
hal@mit.edu
...
ben@mit.edu
psz@mit.edu
Suppose that we notice in the above report some similar email addresses. It is perhaps time to try out the ORDER BY clause:

SQL> select email
from users
where email like '%mit.edu'
order by email;

EMAIL
------------------------------
andy@california.mit.edu
andy@californnia.mit.edu
andy_roo@mit.edu
...
ben@mit.edu
...
hal@mit.edu
...
philg@ai.mit.edu
philg@martigny.ai.mit.edu
philg@mit.edu
Now we can see that this users table was generated by grinding over pre-ArsDigita Community Systems postings starting from 1995. In those bad old days, users typed their email address and name with each posting. Due to typos and people intentionally choosing to use different addresses at various times, we can see that we'll have to build some sort of application to help human beings merge some of the rows in the users table (e.g., all three occurrences of "philg" are in fact the same person (me)).

 

Restricting results

Suppose that you were featured on Yahoo in September 1998 and want to see how many users signed up during that month:

SQL> select count(*)
from users
where registration_date >= '1998-09-01'
and registration_date < '1998-10-01';

COUNT(*)
----------
920
We've combined two restrictions in the WHERE clause with an AND. We can add another restriction with another AND:

SQL> select count(*)
from users
where registration_date >= '1998-09-01'
and registration_date < '1998-10-01'
and email like '%mit.edu';

COUNT(*)
----------
35
OR and NOT are also available within the WHERE clause. For example, the following query will tell us how many classified ads we have that either have no expiration date or whose expiration date is later than the current date/time.

select count(*)
from classified_ads
where expires >= sysdate
or expires is null;

 

Subqueries

You can query one table, restricting the rows returned based on information from another table. For example, to find users who have posted at least one classified ad:

select user_id, email
from users
where 0 < (select count(*)
from classified_ads
where classified_ads.user_id = users.user_id);

USER_ID EMAIL
---------- -----------------------------------
42485 twm@meteor.com
42489 trunghau@ecst.csuchico.edu
42389 ricardo.carvajal@kbs.msu.edu
42393 gon2foto@gte.net
42399 rob@hawaii.rr.com
42453 stefan9@ix.netcom.com
42346 silverman@pon.net
42153 gallen@wesleyan.edu
...
Conceptually, for each row in the users table Oracle is running the subquery against classified_ads to see how many ads are associated with that particular user ID. Keep in mind that this is only conceptually; the Oracle SQL parser may elect to execute this query in a more efficient manner.

Another way to describe the same result set is using EXISTS:


select user_id, email
from users
where exists (select 1
from classified_ads
where classified_ads.user_id = users.user_id);
This may be more efficient for Oracle to execute since it hasn't been instructed to actually count the number of classified ads for each user, but only to check and see if any are present. Think of EXISTS as a Boolean function that
  1. takes a SQL query as its only parameter
  2. returns TRUE if the query returns any rows at all, regardless of the contents of those rows (this is why we can use the constant 1 as the select list for the subquery)

JOIN

A professional SQL programmer would be unlikely to query for users who'd posted classified ads in the preceding manner. The SQL programmer knows that, inevitably, the publisher will want information from the classified ad table along with the information from the users table. For example, we might want to see the users and, for each user, the sequence of ad postings:

select users.user_id, users.email, classified_ads.posted
from users, classified_ads
where users.user_id = classified_ads.user_id
order by users.email, posted;

USER_ID EMAIL POSTED
---------- ----------------------------------- ----------
39406 102140.1200@compuserve.com 1998-09-30
39406 102140.1200@compuserve.com 1998-10-08
39406 102140.1200@compuserve.com 1998-10-08
39842 102144.2651@compuserve.com 1998-07-02
39842 102144.2651@compuserve.com 1998-07-06
39842 102144.2651@compuserve.com 1998-12-13
...
41284 yme@inetport.com 1998-01-25
41284 yme@inetport.com 1998-02-18
41284 yme@inetport.com 1998-03-08
35389 zhupanov@usa.net 1998-12-10
35389 zhupanov@usa.net 1998-12-10
35389 zhupanov@usa.net 1998-12-10
Because of the JOIN restriction, where users.user_id = classified_ads.user_id, we only see those users who have posted at least one classified ad, i.e., for whom a matching row may be found in the classified_ads table. This has the same effect as the subquery above.

The order by users.email, posted is key to making sure that the rows are lumped together by user and then printed in order of ascending posting time.

 

OUTER JOIN

Suppose that we want an alphabetical list of all of our users, with classified ad posting dates for those users who have posted classifieds. We can't do a simple JOIN because that will exclude users who haven't posted any ads. What we need is an OUTER JOIN, where Oracle will "stick in NULLs" if it can't find a corresponding row in the classified_ads table.

select users.user_id, users.email, classified_ads.posted
from users, classified_ads
where users.user_id = classified_ads.user_id(+)
order by users.email, posted;

...
USER_ID EMAIL POSTED
---------- ----------------------------------- ----------
52790 dbrager@mindspring.com
37461 dbraun@scdt.intel.com
52791 dbrenner@flash.net
47177 dbronz@free.polbox.pl
37296 dbrouse@enter.net
47178 dbrown@cyberhighway.net
36985 dbrown@uniden.com 1998-03-05
36985 dbrown@uniden.com 1998-03-10
34283 dbs117@amaze.net
52792 dbsikorski@yahoo.com
...
The plus sign after classified_ads.user_id is our instruction to Oracle to "add NULL rows if you can't meet this JOIN constraint".

 

Extending a simple query into a JOIN

Suppose that you have a query from one table returning almost everything that you need, except for one column that's in another table. Here's a way to develop the JOIN without risking breaking your application:
  1. add the new table to your FROM clause
  2. add a WHERE constraint to prevent Oracle from building a Cartesian product
  3. hunt for ambiguous column names in the SELECT list and other portions of the query; prefix these with table names if necessary
  4. test that you've not broken anything in your zeal to add additional info
  5. add a new column to the SELECT list
Here's an example from Problem Set 2 of a course that we give at MIT (see http://www.photo.net/teaching/psets/ps2/ps2.adp). Students build a conference room reservation system. They generally define two tables: rooms and reservations. The top level page is supposed to show a user what reservations he or she is current holding:

select room_id, start_time, end_time
from reservations
where user_id = 37
This produces an unacceptable page because the rooms are referred to by an ID number rather than by name. The name information is in the rooms table, so we'll have to turn this into a JOIN.

 

Step 1: add the new table to the FROM clause


select room_id, start_time, end_time
from reservations, rooms
where user_id = 37
We're in a world of hurt because Oracle is now going to join every row in rooms with every row in reservations where the user_id matches that of the logged-in user.

 

Step 2: add a constraint to the WHERE clause


select room_id, start_time, end_time
from reservations, rooms
where user_id = 37
and reservations.room_id = rooms.room_id

 

Step 3: look for ambiguously defined columns

Both reservations and rooms contain columns called "room_id". So we need to prefix the room_id column in the SELECT list with "reservations.". Note that we don't have to prefix start_time and end_time because these columns are only present in reservations.

select reservations.room_id, start_time, end_time
from reservations, rooms
where user_id = 37
and reservations.room_id = rooms.room_id

 

Step 4: test

Test the query to make sure that you haven't broken anything. You should get back the same rows with the same columns as before.

 

Step 5: add a new column to the SELECT list

We're finally ready to do what we set out to do: add room_name to the list of columns for which we're querying.

select reservations.room_id, start_time, end_time, rooms.room_name
from reservations, rooms
where user_id = 37
and reservations.room_id = rooms.room_id

 

 

 

Reference




 

 

---

based on  SQL for Web Nerds

Transactions

Created by Anett Szabo, last modified by Anett Szabo 25 Jul 2007, at 04:32 PM

In the introduction we covered some examples of inserting data into a database by typing at SQL*Plus:


insert into mailing_list (name, email)
values ('Philip Greenspun','philg@mit.edu');
Generally, this is not how it is done. As a programmer, you write code that gets executed every time a user submits a discussion forum posting or classified ad. The structure of the SQL statement remains fixed but not the string literals after the values.

The simplest and most direct interface to a relational database involves a procedural program in C, Java, Lisp, Perl, or Tcl putting together a string of SQL that is then sent to to the RDBMS. Here's how the ArsDigita Community System constructs a new entry in the clickthrough log:


insert into clickthrough_log
(local_url, foreign_url, entry_date, click_count)
values
('$local_url', '$foreign_url', trunc(sysdate), 1)"
The INSERT statement adds one row, filling in the four list columns. Two of the values come from local variables set within the Web server, $local_url and $foreign_url. Because these are strings, they must be surrounded by single quotes. One of the values is dynamic and comes straight from Oracle: trunc(sysdate). Recall that the date data type in Oracle is precise to the second. We only want one of these rows per day of the year and hence truncate the date to midnight. Finally, as this is the first clickthrough of the day, we insert a constant value of 1 for click_count.

Atomicity

Each SQL statement executes as an atomic transaction. For example, suppose that you were to attempt to purge some old data with

delete from clickthrough_log where entry_date + 120 < sysdate;
(delete clickthrough records more than 120 days old) and that 3500 rows in clickthrough_log are older than 120 days. If your computer failed halfway through the execution of this DELETE, i.e., before the transaction committed, you would find that none of the rows had been deleted. Either all 3500 rows will disappear or none will.

More interestingly, you can wrap a transaction around multiple SQL statements. For example, when a user is editing a comment, the ArsDigita Community System keeps a record of what was there before:


ns_db dml $db "begin transaction"

# insert into the audit table
ns_db dml $db "insert into general_comments_audit
(comment_id, user_id, ip_address, audit_entry_time, modified_date, content)
select comment_id,
user_id,
'[ns_conn peeraddr]',
sysdate,
modified_date,
content from general_comments
where comment_id = $comment_id"

# change the publicly viewable table
ns_db dml $db "update general_comments
set content = '$QQcontent',
html_p = '$html_p'
where comment_id = $comment_id"

# commit the transaction
ns_db dml $db "end transaction"
This is generally referred to in the database industry as auditing. The database itself is used to keep track of what has been changed and by whom.

Let's look at these sections piece by piece. We're looking at a Tcl program calling AOLserver API procedures when it wants to talk to Oracle. We've configured the system to reverse the normal Oracle world order in which everything is within a transaction unless otherwise committed. The begin transaction and end transaction statements never get through to Oracle; they are merely instructions to our Oracle driver to flip Oracle out and then back into autocommit mode.

The transaction wrapper is imposed around two SQL statements. The first statement inserts a row into general_comments_audit. We could simply query the general_comments table from Tcl and then use the returned data to create a standard-looking INSERT. However, if what you're actually doing is moving data from one place within the RDBMS to another, it is extremely bad taste to drag it all the way out to an application program and then stuff it back in. Much better to use the "INSERT ... SELECT" form.

Note that two of the columns we're querying from general_comments don't exist in the table: sysdate and '[ns_conn peeraddr]'. It is legal in SQL to put function calls or constants in your select list, just as you saw at the beginning of the Queries chapter where we discussed Oracle's one-row system table: dual. To refresh your memory:


select sysdate from dual;

SYSDATE
----------
1999-01-14
You can compute multiple values in a single query:

select sysdate, 2+2, atan2(0, -1) from dual;

SYSDATE 2+2 ATAN2(0,-1)
---------- ---------- -----------
1999-01-14 4 3.14159265
This approach is useful in the transaction above, where we combine information from a table with constants and function calls. Here's a simpler example:

select posting_time, 2+2
from bboard
where msg_id = '000KWj';

POSTING_TI 2+2
---------- ----------
1998-12-13 4

Let's get back to our comment editing transaction and look at the basic structure:
  • open a transaction
  • insert into an audit table whatever comes back from a SELECT statement on the comment table
  • update the comment table
  • close the transaction
Suppose that something goes wrong during the INSERT. The tablespace in which the audit table resides is full and it isn't possible to add a row. Putting the INSERT and UPDATE in the same RDBMS transactions ensures that if there is a problem with one, the other won't be applied to the database.

Consistency

Suppose that we've looked at a message on the bulletin board and decide that its content is so offensive we wish to delete the user from our system:

select user_id from bboard where msg_id = '000KWj';

USER_ID
----------
39685

delete from users where user_id = 39685;
*
ERROR at line 1:
ORA-02292: integrity constraint (PHOTONET.SYS_C001526) violated - child record
found
Oracle has stopped us from deleting user 39685 because to do so would leave the database in an inconsistent state. Here's the definition of the bboard table:

create table bboard (
msg_id char(6) not null primary key,
refers_to char(6),
...
user_id integer not null references users,
one_line varchar(700),
message clob,
...
);
The user_id column is constrained to be not null. Furthermore, the value in this column must correspond to some row in the users table (references users). By asking Oracle to delete the author of msg_id 000KWj from the users table before we deleted all of his or her postings from the bboard table, we were asking Oracle to leave the RDBMS in an inconsistent state.

Mutual Exclusion

  When you have multiple simultaneously executing copies of the same program, you have to think about mutual exclusion. If a program has to
  • read a value from the database
  • perform a computation based on that value
  • update the value in the database based on the computation
Then you want to make sure only one copy of the program is executing at a time through this segment.

The /bboard module of the ArsDigita Community System has to do this. The sequence is

  • read the last message ID from the msg_id_generator table
  • increment the message ID with a bizarre collection of Tcl scripts
  • update the last_msg_id column in the msg_id_generator table
First, anything having to do with locks only makes sense when the three operations are grouped together in a transaction. Second, to avoid deadlocks a transaction must acquire all the resources (including locks) that it needs at the start of the transaction. A SELECT in Oracle does not acquire any locks but a SELECT .. FOR UPDATE does. Here's the beginning of the transaction that inserts a message into the bboard table (from /bboard/insert-msg.tcl):

select last_msg_id
from msg_id_generator
for update of last_msg_id

Mutual Exclusion (the Big Hammer)

The for update clause isn't a panacea. For example, in the Action Network (described in Chapter 16 of Philip and Alex's Guide to Web Publishing), we need to make sure that a double-clicking user doesn't generate duplicate FAXes to politicians. The test to see if the user has already responded is

select count(*) from an_alert_log
where member_id = $member_id
and entry_type = 'sent_response'
and alert_id = $alert_id
By default, Oracle locks one row at a time and doesn't want you to throw a FOR UPDATE clause into a SELECT COUNT(*). The implication of that would be Oracle recording locks on every row in the table. Much more efficient is simply to start the transaction with

lock table an_alert_log in exclusive mode
This is a big hammer and you don't want to hold a table lock for more than an instant. So the structure of a page that gets a table lock should be
  • open a transaction
  • lock table
  • select count(*)
  • if the count was 0, insert a row to record the fact that the user has responded
  • commit the transaction (releases the table lock)
  • proceed with the rest of the script
  • ...

What if I just want some unique numbers?

Does it really have to be this hard? What if you just want some unique integers, each of which will be used as a primary key? Consider a table to hold news items for a Web site:

create table news (
title varchar(100) not null,
body varchar(4000) not null,
release_date date not null,
...
);
You might think you could use the title column as a key, but consider the following articles:

insert into news (title, body, release_date)
values
('French Air Traffic Controllers Strike',
'A walkout today by controllers left travelers stranded..',
'1995-12-14');

insert into news (title, body, release_date)
values
('French Air Traffic Controllers Strike',
'Passengers at Orly faced 400 canceled flights ...',
'1997-05-01');

insert into news (title, body, release_date)
values
('Bill Clinton Beats the Rap',
'Only 55 senators were convinced that President Clinton obstructed justice ...',
'1999-02-12');

insert into news (title, body, release_date)
values
('Bill Clinton Beats the Rap',
'The sexual harassment suit by Paula Jones was dismissed ...',
'1998-12-02);
It would seem that, at least as far as headlines are concerned, little of what is reported is truly new. Could we add
	primary key (title, release_date)
at the end of our table definition? Absolutely. But keying by title and date would result in some unwieldy URLs for editing or approving news articles. If your site allows public suggestions, you might find submissions from multiple users colliding. If you accept comments on news articles, a standard feature of the ArsDigita Community System, each comment must reference a news article. You'd have to be sure to update both the comments table and the news table if you needed to correct a typo in the title column or changed the release_date.

The traditional database design that gets around all of these problems is the use of a generated key. If you've been annoyed by having to carry around your student ID at MIT or your patient ID at a hospital, now you understand the reason why: the programmers are using generated keys and making their lives a bit easier by exposing this part of their software's innards.

Here's how the news module of the ArsDigita Community System works, in an excerpt from http://software.arsdigita.com/www/doc/sql/news.sql:


create sequence news_id_sequence start with 1;

create table news (
news_id integer primary key,
title varchar(100) not null,
body varchar(4000) not null,
release_date date not null,
...
);
We're taking advantage of the nonstandard but very useful Oracle sequence facility. In almost any Oracle SQL statement, you can ask for a sequence's current value or next value.

SQL> create sequence foo_sequence;

Sequence created.

SQL> select foo_sequence.currval from dual;

ERROR at line 1:
ORA-08002: sequence FOO_SEQUENCE.CURRVAL is not yet defined in this session
Oops! Looks like we can't ask for the current value until we've asked for at least one key in our current session with Oracle.

SQL> select foo_sequence.nextval from dual;

NEXTVAL
----------
1

SQL> select foo_sequence.nextval from dual;

NEXTVAL
----------
2

SQL> select foo_sequence.nextval from dual;

NEXTVAL
----------
3

SQL> select foo_sequence.currval from dual;

CURRVAL
----------
3
You can use the sequence generator directly in an insert, e.g.,

insert into news (news_id, title, body, release_date)
values
(news_id_sequence.nextval,
'Tuition Refund at MIT',
'Administrators were shocked and horrified ...',
'1998-03-12);
Background on this story: http://philip.greenspun.com/school/tuition-free-mit.html

In the ArsDigita Community System implementation, the news_id is actually generated in /news/post-new-2.tcl:


set news_id [database_to_tcl_string $db "select news_id_sequence.nextval from dual"]
This way the page that actually does the database insert, /news/post-new-3.tcl, can be sure when the user has inadvertently hit submit twice:

if [catch { ns_db dml $db "insert into news
(news_id, title, body, html_p, approved_p,
release_date, expiration_date, creation_date, creation_user,
creation_ip_address)
values
($news_id, '$QQtitle', '$QQbody', '$html_p', '$approved_p',
'$release_date', '$expiration_date', sysdate, $user_id,
'$creation_ip_address')" } errmsg] {
# insert failed; let's see if it was because of duplicate submission
if { [database_to_tcl_string $db "select count(*)
from news
where news_id = $news_id"] == 0 } {
# some error other than dupe submission
ad_return_error "Insert Failed" "The database ..."
return
}
# we don't bother to handle the cases where there is a dupe submission
# because the user should be thanked or redirected anyway
}
In our experience, the standard technique of generating the key at the same time as the insert leads to a lot of duplicate information in the database.

Sequence Caveats

Oracle sequences are optimized for speed. Hence they offer the minimum guarantees that Oracle thinks are required for primary key generation and no more.

If you ask for a few nextvals and roll back your transaction, the sequence will not be rolled back.

You can't rely on sequence values to be, uh, sequential. They will be unique. They will be monotonically increasing. But there might be gaps. The gaps arise because Oracle pulls, by default, 20 sequence values into memory and records those values as used on disk. This makes nextval very fast since the new value need only be marked use in RAM and not on disk. But suppose that someone pulls the plug on your database server after only two sequence values have been handed out. If your database administrator and system administrator are working well together, the computer will come back to life running Oracle. But there will be a gap of 18 values in the sequence (e.g., from 2023 to 2041). That's because Oracle recorded 20 values used on disk and only handed out 2.

More

 

 

---

based on  SQL for Web Nerds

Views

Created by Anett Szabo, last modified by Anett Szabo 25 Jul 2007, at 04:31 PM

The relational database provides programmers with a high degree of abstraction from the physical world of the computer. You can't tell where on the disk the RDBMS is putting each row of a table. For all you know, information in a single row might be split up and spread out across multiple disk drives. The RDBMS lets you add a column to a billion-row table. Is the new information for each row going to be placed next to the pre-existing columns or will a big new block of disk space be allocated to hold the new column value for all billion rows? You can't know and shouldn't really care.

A view is a way of building even greater abstraction.

Suppose that Jane in marketing says that she wants to see a table containing the following information:

  • user_id
  • email address
  • number of static pages viewed
  • number of bboard postings made
  • number of comments made
This information is spread out among four tables. However, having read the preceding chapters of this book, you're perfectly equipped to serve Jane's needs with the following query:
select u.user_id, 
u.email,
count(ucm.page_id) as n_pages,
count(bb.msg_id) as n_msgs,
count(c.comment_id) as n_comments
from users u, user_content_map ucm, bboard bb, comments c
where u.user_id = ucm.user_id(+)
and u.user_id = bb.user_id(+)
and u.user_id = c.user_id(+)
group by u.user_id, u.email
order by upper(email)
Then Jane adds "I want to see this every day, updated with the latest information. I want to have a programmer write me some desktop software that connects directly to the database and looks at this information; I don't want my desktop software breaking if you reorganize the data model."
create or replace view janes_marketing_view 
as
select u.user_id,
u.email,
count(ucm.page_id) as n_pages,
count(bb.msg_id) as n_msgs,
count(c.comment_id) as n_comments
from users u, user_content_map ucm, bboard bb, comments c
where u.user_id = ucm.user_id(+)
and u.user_id = bb.user_id(+)
and u.user_id = c.user_id(+)
group by u.user_id, u.email
order by upper(u.email)
To Jane, this will look and act just like a table when she queries it:

select * from janes_marketing_view;
Why should she need to be aware that information is coming from four tables? Or that you've reorganized the RDBMS so that the information subsequently comes from six tables?

Protecting Privacy with Views

A common use of views is protecting confidential data. For example, suppose that all the people who work in a hospital collaborate by using a relational database. Here is the data model:

create table patients (
patient_id integer primary key,
patient_name varchar(100),
hiv_positive_p char(1),
insurance_p char(1),
...
);
If a bunch of hippie idealists are running the hospital, they'll think that the medical doctors shouldn't be aware of a patient's insurance status. So when a doc is looking up a patient's medical record, the looking is done through

create view patients_clinical
as
select patient_id, patient_name, hiv_positive_p from patients;
The folks over in accounting shouldn't get access to the patients' medical records just because they're trying to squeeze money out of them:

create view patients_accounting
as
select patient_id, patient_name, insurance_p from patients;
Relational databases have elaborate permission systems similar to those on time-shared computer systems. Each person in a hospital has a unique database user ID. Permission will be granted to view or modify certain tables on a per-user or per-group-of-users basis. Generally the RDBMS permissions facilities aren't very useful for Web applications. It is the Web server that is talking to the database, not a user's desktop computer. So the Web server is responsible for figuring out who is requesting a page and how much to show in response.

Protecting Your Own Source Code

The ArsDigita Shoppe system, described in http://philip.greenspun.com/panda/ecommerce, represents all orders in one table, whether they were denied by the credit card processor, returned by the user, or voided by the merchant. This is fine for transaction processing but you don't want your accounting or tax reports corrupted by the inclusion of failed orders. You can make a decision in one place as to what constitutes a reportable order and then have all of your report programs query the view:

create or replace view sh_orders_reportable
as
select * from sh_orders
where order_state not in ('confirmed','failed_authorization','void');
Note that in the privacy example (above) we were using the view to leave unwanted columns behind whereas here we are using the view to leave behind unwanted rows.

If we add some order states or otherwise change the data model, the reporting programs need not be touched; we only have to keep this view definition up to date. Note that you can define every view with "create or replace view" rather than "create view"; this saves a bit of typing when you have to edit the definition later.

If you've used select * to define a view and subsequently alter any of the underlying tables, you have to redefine the view. Otherwise, your view won't contain any of the new columns. You might consider this a bug but Oracle has documented it, thus turning the behavior into a feature.

Views-on-the-fly and OUTER JOIN

Let's return to our first OUTER JOIN example, from the simple queries chapter:

select users.user_id, users.email, classified_ads.posted
from users, classified_ads
where users.user_id = classified_ads.user_id(+)
order by users.email, posted;

...
USER_ID EMAIL POSTED
---------- ----------------------------------- ----------
52790 dbrager@mindspring.com
37461 dbraun@scdt.intel.com
52791 dbrenner@flash.net
47177 dbronz@free.polbox.pl
37296 dbrouse@enter.net
47178 dbrown@cyberhighway.net
36985 dbrown@uniden.com 1998-03-05
36985 dbrown@uniden.com 1998-03-10
34283 dbs117@amaze.net
52792 dbsikorski@yahoo.com
...

The plus sign after classified_ads.user_id is our instruction to Oracle to "add NULL rows if you can't meet this JOIN constraint".

Suppose that this report has gotten very long and we're only interested in users whose email addresses start with "db". We can add a WHERE clause constraint on the email column of the users table:


select users.user_id, users.email, classified_ads.posted
from users, classified_ads
where users.user_id = classified_ads.user_id(+)
and users.email like 'db%'
order by users.email, posted;

USER_ID EMAIL POSTED
---------- ------------------------------ ----------
71668 db-designs@emeraldnet.net
112295 db1@sisna.com
137640 db25@umail.umd.edu
35102 db44@aol.com 1999-12-23
59279 db4rs@aol.com
95190 db@astro.com.au
17474 db@hotmail.com
248220 db@indianhospitality.com
40134 db@spindelvision.com 1999-02-04
144420 db_chang@yahoo.com
15020 dbaaru@mindspring.com
...
Suppose that we decide we're only interested in classified ads since January 1, 1999. Let's try the naive approach, adding another WHERE clause constraint, this time on a column from the classified_ads table:

select users.user_id, users.email, classified_ads.posted
from users, classified_ads
where users.user_id = classified_ads.user_id(+)
and users.email like 'db%'
and classified_ads.posted > '1999-01-01'
order by users.email, posted;

USER_ID EMAIL POSTED
---------- ------------------------------ ----------
35102 db44@aol.com 1999-12-23
40134 db@spindelvision.com 1999-02-04
16979 dbdors@ev1.net 2000-10-03
16979 dbdors@ev1.net 2000-10-26
235920 dbendo@mindspring.com 2000-08-03
258161 dbouchar@bell.mma.edu 2000-10-26
39921 dbp@agora.rdrop.com 1999-06-03
39921 dbp@agora.rdrop.com 1999-11-05

8 rows selected.
Hey! This completely wrecked our outer join! All of the rows where the user had not posted any ads have now disappeared. Why? They didn't meet the and classified_ads.posted > '1999-01-01' constraint. The outer join added NULLs to every column in the report where there was no corresponding row in the classified_ads table. The new constraint is comparing NULL to January 1, 1999 and the answer is... NULL. That's three-valued logic for you. Any computation involving a NULL turns out NULL. Each WHERE clause constraint must evaluate to true for a row to be kept in the result set of the SELECT. What's the solution? A "view on the fly". Instead of OUTER JOINing the users table to the classified_ads, we will OUTER JOIN users to a view of classified_ads that contains only those ads posted since January 1, 1999:

select users.user_id, users.email, ad_view.posted
from
users,
(select *
from classified_ads
where posted > '1999-01-01') ad_view

where users.user_id = ad_view.user_id(+)
and users.email like 'db%'
order by users.email, ad_view.posted;

USER_ID EMAIL POSTED
---------- ------------------------------ ----------
71668 db-designs@emeraldnet.net
112295 db1@sisna.com
137640 db25@umail.umd.edu
35102 db44@aol.com 1999-12-23
59279 db4rs@aol.com
95190 db@astro.com.au
17474 db@hotmail.com
248220 db@indianhospitality.com
40134 db@spindelvision.com 1999-02-04
144420 db_chang@yahoo.com
15020 dbaaru@mindspring.com
...
174 rows selected.
Note that we've named our "view on the fly" ad_view for the duration of this query.

How Views Work

Programmers aren't supposed to have to think about how views work. However, it is worth noting that the RDBMS merely stores the view definition and not any of the data in a view. Querying against a view versus the underlying tables does not change the way that data are retrieved or cached. Standard RDBMS views exist to make programming more convenient or to address security concerns, not to make data access more efficient.

How Materialized Views Work

 
Starting with Oracle 8.1.5, introduced in March 1999, you can have a materialized view, also known as a summary. Like a regular view, a materialized view can be used to build a black-box abstraction for the programmer. In other words, the view might be created with a complicated JOIN, or an expensive GROUP BY with sums and averages. With a regular view, this expensive operation would be done every time you issued a query. With a materialized view, the expensive operation is done when the view is created and thus an individual query need not involve substantial computation.

Materialized views consume space because Oracle is keeping a copy of the data or at least a copy of information derivable from the data. More importantly, a materialized view does not contain up-to-the-minute information. When you query a regular view, your results includes changes made up to the last committed transaction before your SELECT. When you query a materialized view, you're getting results as of the time that the view was created or refreshed. Note that Oracle lets you specify a refresh interval at which the materialized view will automatically be refreshed.

At this point, you'd expect an experienced Oracle user to say "Hey, these aren't new. This is the old CREATE SNAPSHOT facility that we used to keep semi-up-to-date copies of tables on machines across the network!" What is new with materialized views is that you can create them with the ENABLE QUERY REWRITE option. This authorizes the SQL parser to look at a query involving aggregates or JOINs and go to the materialized view instead. Consider the following query, from the ArsDigita Community System's /admin/users/registration-history.tcl page:


select
to_char(registration_date,'YYYYMM') as sort_key,
rtrim(to_char(registration_date,'Month')) as pretty_month,
to_char(registration_date,'YYYY') as pretty_year,
count(*) as n_new
from users
group by
to_char(registration_date,'YYYYMM'),
to_char(registration_date,'Month'),
to_char(registration_date,'YYYY')
order by 1;


SORT_K PRETTY_MO PRET N_NEW
------ --------- ---- ----------
199805 May 1998 898
199806 June 1998 806
199807 July 1998 972
199808 August 1998 849
199809 September 1998 1023
199810 October 1998 1089
199811 November 1998 1005
199812 December 1998 1059
199901 January 1999 1488
199902 February 1999 2148
For each month, we have a count of how many users registered at photo.net. To execute the query, Oracle must sequentially scan the users table. If the users table grew large and you wanted the query to be instant, you'd sacrifice some timeliness in the stats with

create materialized view users_by_month
enable query rewrite
refresh complete
start with 1999-03-28
next sysdate + 1
as
select
to_char(registration_date,'YYYYMM') as sort_key,
rtrim(to_char(registration_date,'Month')) as pretty_month,
to_char(registration_date,'YYYY') as pretty_year,
count(*) as n_new
from users
group by
to_char(registration_date,'YYYYMM'),
to_char(registration_date,'Month'),
to_char(registration_date,'YYYY')
order by 1
Oracle will build this view just after midnight on March 28, 1999. The view will be refreshed every 24 hours after that. Because of the enable query rewrite clause, Oracle will feel free to grab data from the view even when a user's query does not mention the view. For example, given the query

select count(*)
from users
where rtrim(to_char(registration_date,'Month')) = 'January'
and to_char(registration_date,'YYYY') = '1999'
Oracle would ignore the users table altogether and pull information from users_by_month. This would give the same result with much less work. Suppose that the current month is March 1999, though. The query

select count(*)
from users
where rtrim(to_char(registration_date,'Month')) = 'March'
and to_char(registration_date,'YYYY') = '1999'
will also hit the materialized view rather than the users table and hence will miss anyone who has registered since midnight (i.e., the query rewriting will cause a different result to be returned).

More:

Reference

 

 

 

---

based on  SQL for Web Nerds

Style

Created by Anett Szabo, last modified by Anett Szabo 25 Jul 2007, at 04:31 PM

Here's a familiar simple example from the complex queries chapter:


select user_id,
count(*) as how_many from bboard
where not exists (select 1 from bboard_authorized_maintainers
bam where bam.user_id =
bboard.user_id) and
posting_time + 60 > sysdate group by user_id order
by how_many desc;
Doesn't seem so simple, eh? How about if we rewrite it:

select user_id, count(*) as how_many
from bboard
where not exists (select 1 from
bboard_authorized_maintainers bam
where bam.user_id = bboard.user_id)
and posting_time + 60 > sysdate
group by user_id
order by how_many desc;
If your code isn't properly indented then you will never be able to debug it. How can we justify using the word "properly"? After all, the SQL parser doesn't take extra spaces or newlines into account.

Software is indented properly when the structure of the software is revealed and when the indentation style is familiar to a community of programmers.

Rules for All Queries

If it fits on one line, it is okay to leave on one line:

select email from users where user_id = 34;
If it doesn't fit nicely on one line, give each clause a separate line:

select *
from news
where sysdate > expiration_date
and approved_p = 't'
order by release_date desc, creation_date desc
If the stuff for a particular clause won't fit on one line, put a newline immediately after the keyword that opens the clause. Then indent the items underneath. Here's an example from the ArsDigita Community System's static .html page administration section. We're querying the static_pages table, which holds a copy of any .html files in the Unix file system:

select
to_char(count(*),'999G999G999G999G999') as n_pages,
to_char(sum(dbms_lob.getlength(page_body)),'999G999G999G999G999') as n_bytes
from static_pages;
In this query, there are two items in the select list, a count of all the rows and a sum of the bytes in the page_body column (of type CLOB, hence the requirement to use dbms_lob.getlength rather than simply length). We want Oracle to format these numbers with separation characters between every three digits. For this, we have to use the to_char function and a mask of '999G999G999G999G999' (the "G" tells Oracle to use the appropriate character depending on the country where it is installed, e.g., comma in the U.S. and period in Europe). Then we have to give the results correlation names so that they will be easy to use as Tcl variables. By the time we're done with all of this, it would be confusing to put both items on the same line.

Here's another example, this time from the top-level comment administation page for the ArsDigita Community System. We're going to get back a single row with a count of each type of user-submitted comment:


select
count(*) as n_total,
sum(decode(comment_type,'alternative_perspective',1,0)) as n_alternative_perspectives,
sum(decode(comment_type,'rating',1,0)) as n_ratings,
sum(decode(comment_type,'unanswered_question',1,0)) as n_unanswered_questions,
sum(decode(comment_type,'private_message_to_page_authors',1,0)) as n_private_messages
from comments
Notice the use of sum(decode to count the number of each type of comment. This gives us similar information to what we'd get from a GROUP BY, but we get a sum total as well as category totals. Also, the numbers come out with the column names of our choice. Of course, this kind of query only works when you know in advance the possible values of comment_type.

Rules for GROUP BY queries

When you're doing a GROUP BY, put the columns that determine the group identity first in the select list. Put the aggregate columns that compute a function for that group afterwards:

select links.user_id, first_names, last_name, count(links.page_id) as n_links
from links, users
where links.user_id = users.user_id
group by links.user_id, first_names, last_name
order by n_links desc

 

 

---

based on  SQL for Web Nerds

Next Page