2.2 SQL (Structured Query Language)

SQL was originated in 1970’s in IBM’s San Jose laboratory where it was used as the relational database language for System-R DBMS [1 pg 6].  Successful outcome of this project at 1979, made SQL becoming popular among many RDBMS vendors.  As a result many relational DBMS came to the market such as Oracle by Relational Software (1979), INGRES from Rational Technology (1981), IDM from Britton-Lee (1982), DG/SQL from Data General Corporation (1984) and Sybase from Sybase Inc [2 pg 324].

SQL received certification from American National Standard Institute (ANSI) and from International Standard Organisation (ISO) in 1986.  This SQL-86 standard was extended further to create the new SQL-92 standard in 1992 [2 pg 325].  There are number of advantages in using a standardised relational language such as reduced training costs, application portability, application longality, reduced dependence on a single vendor and cross-system communication [2 pg 326].  Disadvantages of such language are loss of application portability due to addition of special SQL features by the vendor and also it is difficult to fix any deficiencies in the standard.

There are three types of SQL commands [2 pg 328].

  1. Data definition language (DDL) - these commands are issued to create, alter or delete tables.  Usually these commands can be executed by database administrator only to prevent any accidental or deliberate damage to the database.
  2. Data manipulation language (DML) - these commands are used to insert, update, delete or query data from tables.
  3. Data control language (DCL) - these commands are used to grant various access privileges of the database structure to users.  Only the database administrator can execute these commands.

SQL syntax is a broader topic to be covered in this report and most of it is not relevant to the SQ-HAL project.  Only database querying is relevant to this project and this will be covered in the next section.

2.2.1 Database querying using SQL (The syntax)

SQL commands (or commonly referred to as SQL statements) are issued to DBMS, which then execute these commands and return the results (if any).  In order to query or retrieve some specific data from the database, SELECT statement has to be issued.  The syntax for SELECT statement is,

SELECT column_list
FROM table_list
[WHERE conditional_expression]
[GROUP BY group_by_column_lis]
[HAVING conditional_expression]
[ORDER BY order_by_column_list]

The expressions in square brackets are optional and the words in capital letters are the SQL keywords.  All the lists in the statement are comma separated.  Column names and table names are usually case sensitive.  The most common SQL statements involve only the SELECT, FROM and WHERE clauses.

Columns_list in SELECT clause

This list contains what columns to be displayed in the results.  The asterisk (*) represents all the columns.  If the tables list contain more than one table, then each column needs to be referred to as <table_name>.<column_name> format.

Following examples use the sample database shown below.

Figure 2: Sample database structure
[Sample database structure]

Examples -

show all columns in suppliers table
    SELECT * FROM suppliers

show the ‘name’ and ‘address’ columns from the suppliers table
    SELECT name, address FROM suppliers

show the ‘name’ column from suppliers table and ‘description’ column from products table.
    SELECT suppliers.name, products.description FROM suppliers, products

Columns list can also contain expressions or functions.  Expressions have the format of:
<column_name> <operator> <column_name>  AS  <name_for_the_returned_column>

Example -

SELECT description, quantity * cost AS total_value
FROM products

SQL allows many different functions and most common of these functions are COUNT, MIN, MAX, SUM and AVG.

Table 1: SQL Functions
Function Description
COUNT(* | <column_name>) Count number of rows in the return results
MIN(<column_name>) Minimum value in the specified column
MAX(<column_name>) Maximum value in the specified column
SUM(<column_name>) Sum of all the values in the specified column
AVG(<column_name>) Average of all the value in the specified column

Examples -

show total of the column quantity in products table
    SELECT SUM(quantity) FROM products

show number of rows in the suppliers table
    SELECT COUNT(*) FROM suppliers

show the minimum and maximum cost from the products table
    SELECT MIN(cost) AS min_cost, MAX(cost) AS max_cost
    FROM products

The word DISTINCT is used to filter out any repeated rows in the returned results and therefore all there returned rows would be unique.

Conditional expression (in WHERE clause)

The conditional expression restricts what rows are considered in the final results.  Expressions have the format,
    <column_name> <operator> <value>

List of valid operators allowed in these conditional expressions are shown below.

Table 2: List of valid SQL operators
Operator Description Example
= Equal to cost = 10
> Greater than cost > 10
>= Greater than or equal to cost >= 10
< Less than cost < 10
<= Less than or equal to cost <= 10
<> Not equal to cost <> 10
[NOT] BETWEEN .. AND Between the given two values cost BETWEEN 10 AND 20
IS [NOT] NULL Values that are null (or contain no value) description IS NULL
[NOT] IN Values in the given list supplier IN (‘ABC’, ‘XYZ’)
[NOT] LIKE Wildcard matching of values.  Allowed wildcards are underscore (_) to be matched with exactly one character and percentage sign (%) to match any number of characters description  LIKE ‘_BC%’

Number of these expressions can be joined by Boolean operators AND or OR.  Expressions can be negated by the Boolean operator NOT.

Example -    cost > 10 AND cost < 20

GROUP BY and HAVING clauses are used to divide tables into subsets and perform complex operations.

The ORDER BY clause specifies how the final results should be ordered.  (I.e. by what columns and in what order.)  The general format is,
   ORDER BY <column_name> [ASC|DESC]
ASC
- ascending order (default setting)
DESC
- descending order

Example -

Order the content of the product table first by ascending order of description and then by descending order or cost.
   SELECT * FROM products
   ORDER BY description, cost DESC

Theory - Parsing