I want to create a database with 3 main entities each of which will be subtyped further(by subtype I mean generalization-specialization).
This database will support a website where the users can ONLY Read the database. I don’t know how many users will hit the website daily.
As I see vision it, the database will have a TOTAL of 500,000 rows.
Once a week I would like to update the database with around 10 records in ALL in various entities.
Should I use InndoDB? What optimizations can I make since this will be a “mostly read” database. Is it sensible to stop the website for 5 minutes and upload the weekly 10 records? Or can that be done online?
This is an example in MariaDB. Although I have 3 entities,for the sake of a simple example I have used ONE entity Person which has subtypes.
DROP DATABASE IF EXISTS People_Database; CREATE DATABASE People_Database; Use People_Database;
People are classified into 2 categories – Insider(s) and Outsider(s)
Each person is EXACTLY ONE out of Insider or Outsider.
This is an example of Gen-Spec, since each of the 2 categories have some variables in common
and some specific variables. By Gen-Spec I mean Generalization-Specialization.
CREATE TABLE Person_Category(id TINYINT NOT NULL,category CHAR(10)) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO Person_Category(id,category) VALUES (1,"Insider"), (2,"Outsider");
Insiders are further classified into ONE out of 3 categories – Team-mate(s), Alumni and Advisor(s)
This is NOT a Gen-Spec pattern since ALL 3 categories have the SAME variables.
CREATE TABLE Insider_Person_Category (id TINYINT NOT NULL, category CHAR(10) NOT NULL, CONSTRAINT Insider_Person_Category_PK PRIMARY KEY (id)) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO Insider_Person_Category (id, category) VALUES (1, "Team-mate"), (2, "Alumni"), (3, "Advisor"); CREATE TABLE Person(id int NOT NULL AUTO_INCREMENT, first_name CHAR(20), last_name CHAR(20), category_id TINYINT NOT NULL, CONSTRAINT Person_PK PRIMARY KEY (id), CONSTRAINT FOREIGN_KEY (category_id) REFERENCES Person_Category(id), ) ENGINE=InnoDB DEFAULT CHARSET=utf8; CREATE TABLE Insider_Person(id int NOT NULL, insider_category TINYINT NOT NULL, Person_Image MEDIUMBLOB, email_address VARCHAR(100), Designation VARCHAR(100), Biography TEXT, CONSTRAINT Insider_Person_FK1 FOREIGN_KEY (insider_category) REFERENCES Insider_Person_Category(id), CONSTRAINT Insider_Person_PK PRIMARY KEY(id), CONSTRAINT Insider_Person_FK2 FOREIGN_KEY (id) REFERENCES Person(id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; CREATE TABLE Outsider_Person (id int NOT NULL, Person_URL TEXT, Person_Detailed_Designation TEXT, CONSTRAINT Outsider_Person_PK PRIMARY KEY(id), CONSTRAINT Outsider_Person_FK FOREIGN_KEY (id) REFERENCES Person(id)) ENGINE=InnoDB DEFAULT CHARSET=utf8; DELIMITER // CREATE PROCEDURE Person_Database.Person_Add (id INT,first_name CHAR(20),last_name CHAR(20),category_id TINYINT, insider_category TINYINT, Person_Image MEDIUMBLOB, email_address VARCHAR(100),Designation VARCHAR(100),Biography TEXT, Person_URL TEXT,Person_Detailed_Designation TEXT) BEGIN START TRANSACTION; --- Basic check before we start. IF category_id NOT in (1,2) ROLLBACK; ENDIF; --- Insert into Person INSERT INTO Person(id,first_name,last_name,category_id) VALUES (id,first_name,last_name,category_id); IF category_id=1 --- Insert into Insider_Person INSERT INTO Inside_Person(id,insider_category,Person_Image,email_address,Designation,Biography) VALUES (id,insider_category,Person_Image,email_address,Designation,Biography); ELSE --- Insert INTO Outsider_Person INSERT INTO Outside_Person(id,Person_URL,Person_Detailed_Designation) VALUES (id,Person_URL,Person_Detailed_Designation); END IF; COMMIT; END; // DELIMITER ;
Notice if we add People only via the above user defined procedure, one Person will ALWAYS have EXACTLY ONE category ( Insider/ Outsider). If the Transaction fails after adding to Person it will be rolled back so that we don’t have a Person who is neither an Insider or an Outsider. Also notice how this takes care of the Shared Primary Key – (called id ( in tables Person / Insider_Person / Outsider_Person tables ) in the above example).
Query : How do I ensure that a Person will be added only via the above function and not in any other way ? I will add only 10 records each week, for the rest of the time it will be a READ ONLY database. How can I exploit this and optimize it’s behavior?
The SELECT queries for this database.
Strategy : Join the Insiders with Insider Category and THEN Add on the attributes for Insiders from Person
SELECT * FROM Insider_Person as IP LEFT JOIN Insider_Person_Category as IPC ON IP.insider_category = IPC.id LEFT JOIN Person as P ON IP.id = P.id;
Strategy : ONLY Some people are Outsiders so we should do a LEFT JOIN
SELECT * FROM Outsider_Person as OP LEFT JOIN Person as P ON OP.id = P.id;
Strategy : We need to use 4 LEFT JOINs since Person has ALL people, only some are in Insider_Person.
Similary we may have Inside Person Categories which MAY not have corresponding Insider_Person to start with.
Also we have ALL People in Person but only some are in Outsider_Person. What if by chance we have only INSIDERS ? Then we should only do a LEFT JOIN with Person_Category.That is why we have the following :-
SELECT * FROM Person as P LEFT JOIN Insider_Person as IP ON P.id = IP.id LEFT JOIN Insider_Person_Category as IPC ON IP.insider_category = IPC.id LEFT JOIN Outsider_Person as OP ON P.id = OP.id LEFT JOIN Person_Category as PC ON P.category_id=PC.id;