Lecture 1
Course’s aim:
- Introdution for coursees
A CS Store from Liverpool to national
Lecture 2
Introduction to databases:
Databases abstractly
Databases can be used for most purposes in computer science (Python, Java, C or C++)
Databases are not general-purpose, they are domain-specific
(It’s everywhere !)Databases, in a nutshell, are for storing, searching and aggregating information/data
Databases Drives the Web
User -> Web service -> Database
- Requires extremely efficient systems for querying dada,
and for storing and mainpulating data in a safe way.
Application
- Online store
- Social media site
- Airline/train/hotel/reservation
- many more
Databases Drives Businesses
- Banking:
- debit/ credit card transactons
- Stock market
- Big companies & organisations:
- Employees, customers, products, sales, schedules, logistics, clients, etc
- Purposes:
- Some are worthwhile directly
- Management, data analytics and decision making
Databases in Science & Public Service
- Science:
- Astronomy
- Human Genome
- Biochemists exploring properties of proteins
- Biology and life sciences
- Geology
- Medicine/NHS:
- Patinet records
- Diseases and their relationaships, treaments, etc.
- Decision making
- Many otheres
Lecture 3
Basics of SQL and the relational model
- insert
- delete
- query
Relational DBMS Components
User/application
- QUery Compiler
- Exection Engine
- index/file/record manager
Transaction Management
- Transaction manager
- Logging and recovery
- Concurrency Control
Beyond Plain Relational DBMS
- Distributed DBMSs 分散数据库
- Beyond Relational data
- data warehouses/analysis/mining
Lecture 4 Basic SQL
Introduction to relational databases and SQL
Relational database
- Data is organised in tables
- Each table has a schema => Items(name, price, number)
In typical use(Tables)
Each table typically has:
- Few columns/attributes that are fixed(i.e. in name and type of content)
- The number of rows depends on the table and can change
(Rows)
Each row corresponds to an entity or similar, or to relationship between such
- Rows in the same table corresponds to the same kind of entity or relation
- A row in a table could correspond to a transaction
- A sidenote
SQL
- Relational databases are accessed using SQL(Standard Query Language)
- The standard for SQL is updated every few years
- The implementations does not follow the standard that well though
SQL Parts
- Data Definition Language(DDL)
- Create/alter/delete databases, tables and their attributes
- Data Manipulation Language(DML)
- Add/remove/update and query rows in tables
- Transact-SQL
- ntuitively, do a sequence of SQL statements
SQL DDL
- SQL DDL
- Create a database: CREATE DATABASE nnnnnn; (不区分大小写)
- CREATE TABLE Table_name (column1 datatype,column2 datatype,column3 datatype);
(换行操作和线性同样) - Datatypes
- Unique specify that you want some attribute to be unique
- Primary Key: CONSTRAINT PK_tablenames PRIMARY KEY (attributes)
data should be sorted physically in storage by defining a primary key
- Primary keys must be unique and can ony be 1
- Primary key per table(there can be many unique attributes/set of attributes)
- Primary key are often id number, e.g. student id/employee id or similar
- Foreign key: CONSTRAINT FK_kidtablename FOREIGN KEY(attributes) REFERENCES Fathertablename(attributes)
- A foreign key is used to link two tables together explicitly
- 从一个table的一些attributes指向另外的table的primary key (子表的某些属性指向主键)
- It ensured that the values in the attributes with the foreign key in the child table must also be in the parent table
- Can be used to create custom datatypes, by having a table with all the values you want to allow
- A foreign key is used to link two tables together explicitly
- DROP : DROP DATABASE database_name; to removes the current database
*DROP TABLE tablename* to removes the current table
- Modifying tables:
- ALTER TABLE tablenames ADD attribute datatype : Adds an attribute
- ALTER TABLE tablenames MODIFY attribute datatype : Changes the current attribute
- ALTER TABLE tablenames DROP COLUMN attribute : Removes the email attribute again
SQL DML
- SQL DML
- Insert: INSERT INTO students VALUES(‘Oliver’, 20171112, ‘G402’);
- Delete: DELETE FROM Students WHERE name = ‘john’;
- Conditions in WHERE clauses
- Comparisons: =, <, <=, >=, <>(or != for the last)
- Conditions can contain:
- AND
- OR
- NOT
- BETWEEN
- LIKE: for string matching
- “_” for any 1 letter
- “%” for any number of letters
- UPDATE: UPDATE studnets SET programme = ‘G402’ WHERE name = ‘Oliver’;
operation details:
- 插入新数据要有主键
- 外键要和主键对应
- 不能超过规定长度
- 变量名字不能错误
Lce-5 SQL qureries - required part
SQL Queries:
Queries in SQL have the following form:
Most basic query
- SELECT
SELECT defines what is outputted, making you able to do four kinds of modifications of it:
这里星号表示查询出表里面的全部字段
- Projection($\pi$)
- DISTINCT
- Renaming($\rho$)
- Creating new columns
Projection($\pi$)
Projection allows you to select attributes you want to keep(the rest qre discarded)
You use it by writing a list of attributes
变量的顺序很重要
DISTINCT
DISTINCT is for remobing duplicated rows
Renaming($\rho$)
Renaming allows you to rename attributes
You use it by writing AS and then the new name after the attribute
Creating new columns
Mix and match
The four modification types can also be mixed and matched as wanted
FROM
FORM defines the input and would be easy if we only allowed one input
In general, FROM can cotain many input tables and we combine them together in various ways
- How we combine the tables is defined by FROM
Only some ways of combining will be discussed in this course, but there exists others
The primarily used ones will be:
Cross product(X)
Natural join
UA 和 UB 的并集
Lec 6 SQL queries (optional part)
Conetent:
- WHERE
- GROUP BY
- HAVING
- ORDER BY
WHERE
- WHERE:IN
the example used was:
However,instead of providing a list, one can also do a sub-query (here using a similar University database as in the example above):
- WHERE: EXISTS
EXISTS is a generalization of IN
- WHERE: Semi-join
Most types of joins done in the FROM clause
- See the video on the required part of SQL queries
There is q exception though, which is semi-joins, which are done in WHERE clauses
A(left)semi-join between two tables A and B, find the subset of rows in A that would be used to form some row of the natural join of A and B
- A right semi-join would find the subset of instead
These are done using EXISTS(or IN, if there is only one shared attribute)
The example we saw of EXISTS was really a left semi-join of Student with lecturers
- We only kept the rows from students that had a matching first and last name as a lecturer, which were their common attributes
GROUP BY
Instead of first discussing how it is used/what it does, I will start by what GROUP BY is used for Say I want to know how many transactions Anne(i.e. e_id=1)was part of:
Say I want to do that for each employee instead of just Anne
To do that, you just replace WHERE with GROUP BY(and remove = 1)
A complication
If we include GROUP BY, then we can only include in SELECT attributes we GROUP BY or aggregates
E.g. say we wanted the first name of the employees as well as their employee id and the number of their transactions
A solution to the complication
Intution
Intuitively speaking, GROUP BY works like this:
Do the previous part of the query until GROUP BY (except ignoring the part in SELECT)
Split the resulting table into sub-tables
- 1 sub-table for each value of the variables in GROUP BY
- I.e. if you had say first_nameand e_idlike in the example, you would have a sub-table for each pair of first_nameand e_id. This means, e.g. that if two employees were named Anne, you would have two sub-tables, one for each Anne, defined by their e_id
HAVING
在 GROUP BY 之后的 WHERE
ORDER BY
Lec7- SQL Queries
UNION
取并集
合并的名字要一直
Basically, the same as in math with sets (不允许有重复元素出现)
You can also view is as putting one list after another
Say we have an Employees table
We want a list of names i.e. both first and family name
1 | SELECT first_name as name |
Things work
Things work in interesting ways
Lec8- SQL QUeries - misc. and relational algebra
Views: as saved queries
1 | SELECT first_name, e_id,COUNT(t_id) |
![p1](https://pic.imgdb.cn/item/6175a4ce2ab3f51d917d6007.png)
1 | CREATE VIEW Employee_transaction_count AS |
![p1](https://pic.imgdb.cn/item/6175a5952ab3f51d917dd9a9.png)
Views: as virtual tables
![p1](https://pic.imgdb.cn/item/6175a5952ab3f51d917dd9a9.png)
More on views
视图可以覆盖或者删除
![p1](https://pic.imgdb.cn/item/6175a6bc2ab3f51d917e7959.png)
Relation Algebra
Algebra
Algebra = branch of mathExample: Algebra with numbers
- Addition(+) = function that takes two numbers and returns a number
- Logarithm (written: log) = function that takes one number and returns another
- Has a subscript-the base
Can be composed, e.g.: log2(3 + 8) + 4
Relational algebra
Algebra with tables $\approx$ SQL SELECT queries
- Exception: uses set semantics
Relational algebra is crucial for optimization
Projectin($\pi$)
将R限制为属性列表中的属性(sets)
Renaming($\rho$)
Selection($\sigma$)
所有符合条件的束