I want to create a database with 3 main entities each of which will be subtyped further(by subtype I mean generalization-specialization).
This database will support a website where the users can ONLY Read the database. I don’t know how many users will hit the website daily.
As I see vision it, the database will have a TOTAL of 500,000 rows.
Once a week I would like to update the database with around 10 records in ALL in various entities.
Should I use InndoDB? What optimizations can I make since this will be a “mostly read” database. Is it sensible to stop the website for 5 minutes and upload the weekly 10 records? Or can that be done online?
This is an example in MariaDB. Although I have 3 entities,for the sake of a simple example I have used ONE entity Person which has subtypes.
DROP DATABASE IF EXISTS People_Database;
CREATE DATABASE People_Database;
Use People_Database;
People are classified into 2 categories – Insider(s) and Outsider(s)
Each person is EXACTLY ONE out of Insider or Outsider.
This is an example of Gen-Spec, since each of the 2 categories have some variables in common
and some specific variables. By Gen-Spec I mean Generalization-Specialization.
CREATE TABLE Person_Category(id TINYINT NOT NULL,category CHAR(10)) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO Person_Category(id,category)
VALUES
(1,"Insider"),
(2,"Outsider");
Insiders are further classified into ONE out of 3 categories – Team-mate(s), Alumni and Advisor(s)
This is NOT a Gen-Spec pattern since ALL 3 categories have the SAME variables.
CREATE TABLE Insider_Person_Category (id TINYINT NOT NULL,
category CHAR(10) NOT NULL,
CONSTRAINT Insider_Person_Category_PK PRIMARY KEY (id)) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO Insider_Person_Category (id, category)
VALUES
(1, "Team-mate"),
(2, "Alumni"),
(3, "Advisor");
CREATE TABLE Person(id int NOT NULL AUTO_INCREMENT,
first_name CHAR(20),
last_name CHAR(20),
category_id TINYINT NOT NULL,
CONSTRAINT Person_PK PRIMARY KEY (id),
CONSTRAINT FOREIGN_KEY (category_id) REFERENCES Person_Category(id),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE Insider_Person(id int NOT NULL,
insider_category TINYINT NOT NULL,
Person_Image MEDIUMBLOB,
email_address VARCHAR(100),
Designation VARCHAR(100),
Biography TEXT,
CONSTRAINT Insider_Person_FK1 FOREIGN_KEY (insider_category) REFERENCES Insider_Person_Category(id),
CONSTRAINT Insider_Person_PK PRIMARY KEY(id),
CONSTRAINT Insider_Person_FK2 FOREIGN_KEY (id) REFERENCES Person(id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE Outsider_Person (id int NOT NULL,
Person_URL TEXT,
Person_Detailed_Designation TEXT,
CONSTRAINT Outsider_Person_PK PRIMARY KEY(id),
CONSTRAINT Outsider_Person_FK FOREIGN_KEY (id) REFERENCES Person(id)) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DELIMITER //
CREATE PROCEDURE Person_Database.Person_Add (id INT,first_name CHAR(20),last_name CHAR(20),category_id TINYINT,
insider_category TINYINT,
Person_Image MEDIUMBLOB,
email_address VARCHAR(100),Designation VARCHAR(100),Biography TEXT,
Person_URL TEXT,Person_Detailed_Designation TEXT)
BEGIN
START TRANSACTION;
--- Basic check before we start.
IF category_id NOT in (1,2)
ROLLBACK;
ENDIF;
--- Insert into Person
INSERT INTO Person(id,first_name,last_name,category_id) VALUES (id,first_name,last_name,category_id);
IF category_id=1
--- Insert into Insider_Person
INSERT INTO Inside_Person(id,insider_category,Person_Image,email_address,Designation,Biography)
VALUES
(id,insider_category,Person_Image,email_address,Designation,Biography);
ELSE
--- Insert INTO Outsider_Person
INSERT INTO Outside_Person(id,Person_URL,Person_Detailed_Designation)
VALUES
(id,Person_URL,Person_Detailed_Designation);
END IF;
COMMIT;
END;
//
DELIMITER ;
Notice if we add People only via the above user defined procedure, one Person will ALWAYS have EXACTLY ONE category ( Insider/ Outsider). If the Transaction fails after adding to Person it will be rolled back so that we don’t have a Person who is neither an Insider or an Outsider. Also notice how this takes care of the Shared Primary Key – (called id ( in tables Person / Insider_Person / Outsider_Person tables ) in the above example).
Query : How do I ensure that a Person will be added only via the above function and not in any other way ? I will add only 10 records each week, for the rest of the time it will be a READ ONLY database. How can I exploit this and optimize it’s behavior?
The SELECT queries for this database.
Only Insiders
Strategy : Join the Insiders with Insider Category and THEN Add on the attributes for Insiders from Person
SELECT * FROM Insider_Person as IP LEFT JOIN Insider_Person_Category as IPC ON IP.insider_category = IPC.id
LEFT JOIN Person as P ON IP.id = P.id;
Strategy : ONLY Some people are Outsiders so we should do a LEFT JOIN
Only Outsiders
SELECT * FROM Outsider_Person as OP LEFT JOIN Person as P ON OP.id = P.id;
ALL People
Strategy : We need to use 4 LEFT JOINs since Person has ALL people, only some are in Insider_Person.
Similary we may have Inside Person Categories which MAY not have corresponding Insider_Person to start with.
Also we have ALL People in Person but only some are in Outsider_Person. What if by chance we have only INSIDERS ? Then we should only do a LEFT JOIN with Person_Category.That is why we have the following :-
SELECT * FROM Person as P LEFT JOIN Insider_Person as IP ON P.id = IP.id
LEFT JOIN Insider_Person_Category as IPC ON IP.insider_category = IPC.id
LEFT JOIN Outsider_Person as OP ON P.id = OP.id
LEFT JOIN Person_Category as PC ON P.category_id=PC.id;