When you create a database, it usually with a few core goals in mind: to store data in an easy manner, and to be able to easily access or manipulate it. For the latter, this could for example be to insert new data into the database or delete useless and outdated data. However, one of the challenges that arise when we want to modify our database — that we may violate the integrity constraints of the database. To avoid violating these constraints, it is possible to use multiple methods. In this article, we will discuss the methods called ‘constraints’, ‘assertions’, and ‘triggers.
Presenting the Integrity Constraints
Before we dive into the three methods stated above, we should first consider the concept of ‘integrity constraints’ a little bit. Integrity constraints are a set of rules, and they are used to maintain the quality of the information in our database. The integrity constraints assure that when we perform, for example, data insertion or deletion, the data integrity is not violated. In short, integrity constraints are put in place to protect the database against accidental damage. There exist multiple types of integrity constraints, all of which can be seen in the visualization below:
We can try to go through the different types to understand what they mean:
The domain constraint can be defined as the definition of a valid set of values for a given attribute. For example, every domain has a given data type. If the data we want to insert does not fall into this data type, it should not be allowed into the database. We can take a simple example:
We can see that we only accept integers as a valid data type for the domain, ‘age’. Hence, if we were to insert a string, then it would violate the integrity of the domain. We can then say that we have a domain constraint since we only accept values that do not violate this integrity.
Another simple example could be if we have a database of which months a person was born in — then we might store it either as strings or integers, but we can a few values of these data types. We could not accept an integer like ‘17’ since it does not correspond to any month.
Entity Integrity Constraint
The entity integrity constraint states that the primary key value can’t be null, since we use the primary key to uniquely identify rows. Hence, if the primary key is null — then there is a row we cannot uniquely identify through the primary key, which defeats the purpose. We can take an example:
We can see that we have a null value for our primary key, the ID. Hence, we cannot use it to find the row containing the information about Bobby. This violates the Entity Integrity Constraint; hence we would need to make sure that our primary key does not contain null values.
Referential Integrity Constraint
Imagine that we have two tables, which are connected through a foreign key. If a foreign key in table 1 refers to the Primary Key of table 2, then every value of the Foreign Key in table 1 must be either null or in table 2. Otherwise, there is a reference in table 1 to nothing — which is an inconsistency. We can take an illustrative example:
We can see that we have a violation in our first row in table 1 since we have no primary key in table 2 corresponding to the value of ‘11’.
When we have a key to our table, not only a primary key, then it must uniquely identify a row. This is called the key constraint since keys per definition are how we identify rows — hence, if we have a non-unique key, then it is no longer a key. We can take an illustrative example:
We can see that our key, also our primary key, in this case, has a duplicate. This means our key is no longer unique, and it violates the key constraint. We would need to change the ID for Bobby or Gavin to no longer violate the constraint.
Constraints in SQL
We have now presented the constraints above on a theoretical level, but we can now try to consider how they could be implemented through SQL. We can take them in the same order, as presented above.
As we stated above, a domain constraint means that the piece of data we want to insert into our table has a correct data type or an acceptable value. We can take the example again, where we want to insert ‘Bobby’ into our database. We then need to not only check the data type but also the value — for example, he cannot have a negative age! The question is how this could be done through SQL. Let us see an example of how it could be done when we create our table:
We can see that we have created a constraint on our table, People. It will be used every time we want to insert something into our table since it will check whether the ‘age’ we insert is above 0 through the ‘check’ statement. We can also see that we cannot insert string-values, since we have specified that ‘age’ should be an integer.
Is it possible to make a constraint on multiple columns? The answer is again, yes. Let us now imagine that we have the following database table:
We have simply added a new attribute, which stores the gender of the people in our database. Then clearly, we would need to make sure that each time we insert into our database that people are either stored as ‘M’ or ‘F’. We could then make a new create table statement:
We have now made a check statement across multiple columns, also called a ‘tuple-based’ check constraint. We now check both that the age is non-negative and that the gender falls in either ‘F’ or ‘M’.
Here, we considered the fact that primary keys are not allowed to be null. This constraint is very easy to implement in SQL, and we can already see an example in the SQL statements above. Let us consider it again:
This time, consider the line: ‘ID int primary key’. We simply specify that the ID is our primary key, and then the database makes sure that this is not violated when we insert something into it. It is a mix between two constraints: ‘not null’ and ‘unique’ since we can accept neither for a primary key.
When we want to implement a foreign key in our SQL, then we simply need to specify it. Imagine that we first make a table, which keeps track of departments through their Department Number and building description. It could look like so:
We have now specified that our attribute, DepNo, is a foreign key and the database must respect the constraint inherent in it when we insert or delete something from our database.
The question is, how can we then, for example, delete from our database? Because a value cannot be deleted from the ‘master’ table if the value is lying in the ‘child’ Table. Let us consider the following table:
Imagine we want to delete the row with ‘ID = 2’, but we have a reference for our DepNo which lies in our Departments table. We cannot, since we cannot delete from the master table if a value lies in the child table. To solve this, we would need to delete on cascade — meaning if we delete one place, then we also delete in the other place automatically. The create table statement would then look like so:
Now, every time we delete in People — then we also delete in Departments.
Here, we simply specified that a key must be both unique and not null. Hence, we can use the constraint statements ‘not null’ and ‘unique’ when we specify a key.
Assertions are different from check constraints in the way that check constraints are rules that relate to one single row only. Assertions, on the other hand, can involve any number of other tables, or any number of other rows in the same table. Assertions also check a condition, which must return a Boolean value. We can take an illustrative example.
Let us imagine that we have the following table, which contains employees in a company — we then also store an attribute containing their salary.
We then want to make an assertion that there is no employee in our database, which is paid more than 30.000$ or less than 500$. It would then look like:
Then it makes sure that we never have someone who receives a salary outside these bounds.
We can now consider the concept of triggers. Triggers are sometimes called event-condition-action rules. This is since triggers are only active in certain scenarios. We can describe the process of a trigger:
- Triggers are only awakened when certain events occur. This is usually the event of ‘delete’, ‘insert’, or ‘update’.
- When the trigger is awakened, the trigger tests a condition. If the condition does not hold, then nothing else associated with the trigger happens in response to the given event. On the other hand, if the trigger is satisfied then a pre-defined action is performed by the trigger.
We can take an example of a trigger. Let us imagine that we have the following database:
We already have a check constraint, where we check the salary. It could, however, also be written as a trigger. We could write a trigger that is awakened every time we want to insert something into our database, which then checks a condition upon the salary:
We can now see that our trigger is awakened every time we want to insert something into our database and it is executed before it is inserted — this is since we need to check the condition before we are willing to insert. It is also possible to create triggers that are executed after, or even before and after.
We have now seen the different cases of integrity in databases and how they can be maintained through the three methods presented: constraints, assertions, and triggers. It is important to consider these things when creating a database since it is essential to make sure the database is consistent and respects the inherent integrity — otherwise, you might harm it and render it less useful.