Glossary of Database and SQL Terms

Last modified: April 08, 2024

Database: A collection of organized data for easy access, management, and updating.
Table: A structure with rows and columns for storing data in a database.
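Row (Record): A single entry in a table, holding one value for each column.
Column (Field): A named attribute of a table that holds one kind of data for every row.
Primary Key: A column, or set of columns, whose value uniquely identifies each row in a table.
Foreign Key: A column, or set of columns, that links one table to another by referencing the primary key (or another unique key) of the other table.
Index: A database object that speeds up data retrieval from a table, at the cost of extra storage and slower writes.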
Query: A request to access or modify data in a database.
SQL (Structured Query Language): A language for working with relational databases.
SELECT: An SQL command for retrieving data from one or more tables.
INSERT: An SQL command for adding new data to a table.
UPDATE: An SQL command for changing existing data in a table.
DELETE: An SQL command for removing data from a table.
JOIN: An SQL operation that combines rows from two or more tables based on related columns.
WHERE: An SQL clause for filtering rows that meet specific conditions.
GROUP BY: An SQL clause that groups rows sharing the same values in specified columns, typically so that aggregate functions can be applied to each group.
ORDER BY: An SQL clause for sorting results by one or more columns.
Schema: The structure of a database, including tables, columns, and relationships.
ACID (Atomicity, Consistency, Isolation, Durability): The four properties that make database transactions reliable.
RDBMS (Relational Database Management System): A system for managing relational databases using SQL.
Constraint: A rule for table columns to keep data accurate and consistent.
UNIQUE: A constraint that makes sure all values in a column are different.
NOT NULL: A constraint that requires a column to have a value.
CHECK: A constraint that requires every value in a column to satisfy a given condition.
View: A virtual table created from the results of an SQL query.
Alias: A temporary name given to a table or column in an SQL query for easier reference.
Transaction: A group of SQL operations executed as a single unit of work: either all of them take effect or none do.
COMMIT: An SQL command for saving changes made by a transaction.
ROLLBACK: An SQL command for undoing changes made by a transaction.
Trigger: A stored routine that runs automatically when a specified event (INSERT, UPDATE, or DELETE) occurs on a table.
Stored Procedure: A named set of SQL statements saved in the database and executed on demand.
Function: A named routine stored in the database that accepts input parameters and returns a result.
Normalization: A method for organizing data in a database to reduce redundancy and improve data integrity.
Denormalization: A process of adding redundant data to a database to speed up query performance.
DDL (Data Definition Language): A part of SQL for creating and modifying database objects like tables and indexes, including CREATE, ALTER, and DROP.
DML (Data Manipulation Language): A part of SQL for working with data in a database, including SELECT, INSERT, UPDATE, and DELETE.
DCL (Data Control Language): A part of SQL for managing user access and permissions, such as GRANT and REVOKE.
TCL (Transaction Control Language): A part of SQL for handling transactions, including COMMIT and ROLLBACK.
NULL: A special marker in SQL that indicates a data value is missing or unknown in the database.
NoSQL: A class of non-relational databases (such as key-value, document, wide-column, and graph stores) that typically trade some relational features for flexible schemas and easier horizontal scaling.
CAP Theorem: A principle stating that a distributed data store cannot guarantee consistency, availability, and partition tolerance all at once; when a network partition occurs, it must give up either consistency or availability.
Sharding: The process of splitting a large database into smaller pieces (shards) stored on separate servers, so that data and load are spread across machines.
Partitioning: The practice of dividing a table into smaller, more manageable pieces based on a specific column or set of columns.
Replication: The process of copying and maintaining the same data on multiple database nodes to increase availability and fault tolerance.
BASE (Basically Available, Soft state, Eventually consistent): A set of properties describing some distributed systems that take a more relaxed approach to consistency than ACID.
Graph Database: A type of NoSQL database that stores data as nodes and edges in a graph, optimized for querying and traversing relationships between data points.
Amazon RDS: A managed relational database service provided by Amazon Web Services (AWS), offering support for multiple database engines, including MySQL, PostgreSQL, and Oracle.
Amazon DynamoDB: A managed NoSQL database service provided by AWS, designed for high availability, scalability, and low latency.
Amazon Aurora: A managed relational database service provided by AWS that is compatible with MySQL and PostgreSQL and designed for high performance, availability, and scalability.
Caching: Temporary storage of query results or intermediate data to speed up subsequent query executions.
Horizontal Scaling: The practice of adding more nodes to a system to handle increased workload, often used in distributed systems to improve performance and availability.
Vertical Scaling: The practice of adding more resources, such as CPU or memory, to a single node to handle increased workload.
In-Memory Database: A type of database that stores data in the main memory instead of on disk, providing faster data access and processing times.
SQL Injection: A security vulnerability in which an attacker inserts malicious SQL code into a query, usually through unsanitized user input, potentially compromising the database or exposing sensitive data.
ETL (Extract, Transform, Load): A process used to collect, clean, and move data from one or more sources to a data warehouse or another data store.
OLTP (Online Transaction Processing): A class of systems designed for managing transactional workloads, such as inserting, updating, and deleting records.
OLAP (Online Analytical Processing): A class of systems designed for managing analytical workloads, such as complex queries and aggregations.
Data Warehouse: A large-scale data store optimized for storing, managing, and analyzing large amounts of historical data from various sources.
Big Data: A term referring to the massive volume, variety, and velocity of data generated by modern applications and devices, often requiring specialized tools and techniques for processing and analysis.
Hadoop: An open-source framework for distributed storage and processing of large datasets using the MapReduce programming model.
MapReduce: A programming model for processing and generating large data sets in parallel across a distributed computing environment.
Apache Spark: An open-source distributed data processing engine designed for high-performance, large-scale data processing and machine learning tasks.
Apache Cassandra: A highly scalable, distributed NoSQL database designed for handling large amounts of data across many nodes, providing high availability and fault tolerance.
Elasticsearch: An open-source, distributed search and analytics engine built on Apache Lucene, used for indexing and searching large volumes of data.
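
Example: Several of the core SQL terms above can be seen working together in one short script. The following is a minimal sketch, not a reference implementation. It assumes Python's built-in sqlite3 module and an in-memory SQLite database; the authors and books tables and their contents are hypothetical and chosen only for illustration. It touches on DDL, constraints (PRIMARY KEY, foreign key, NOT NULL, UNIQUE, CHECK), DML, aliases, JOIN, WHERE, GROUP BY, ORDER BY, and a transaction that is rolled back.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# DDL: two hypothetical tables linked by a foreign key.
conn.executescript("""
CREATE TABLE authors (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE books (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    price     REAL CHECK (price >= 0),
    author_id INTEGER NOT NULL REFERENCES authors(id)
);
""")

# DML inside a transaction: "with conn" commits on success and
# rolls back if an exception escapes the block.
# The ? placeholders keep values out of the SQL text, which is the
# usual defense against SQL injection.
with conn:
    conn.executemany(
        "INSERT INTO authors (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO books (title, price, author_id) VALUES (?, ?, ?)",
        [("Notes", 10.0, 1), ("Engines", 20.0, 1), ("Compilers", 15.0, 2)],
    )

# Query: JOIN the tables, filter with WHERE, aggregate per author with
# GROUP BY, and sort with ORDER BY. "a" and "b" are table aliases.
rows = conn.execute(
    """
    SELECT a.name AS author, COUNT(*) AS n_books, SUM(b.price) AS total
    FROM authors AS a
    JOIN books AS b ON b.author_id = a.id
    WHERE b.price > ?
    GROUP BY a.name
    ORDER BY total DESC
    """,
    (5,),
).fetchall()

# Atomicity: the second INSERT violates the CHECK constraint, so the
# whole transaction is rolled back, including the first, valid INSERT.
rolled_back = False
try:
    with conn:
        conn.execute(
            "INSERT INTO books (title, price, author_id) VALUES (?, ?, ?)",
            ("Valid", 5.0, 2),
        )
        conn.execute(
            "INSERT INTO books (title, price, author_id) VALUES (?, ?, ?)",
            ("Broken", -1.0, 2),
        )
except sqlite3.IntegrityError:
    rolled_back = True

book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
print(rows)
print(rolled_back, book_count)
```

In this sketch the final row count is unchanged by the failed transaction, which is the behavior that atomicity describes. Other database systems offer the same concepts, but details such as data type names and how foreign key enforcement is enabled differ between them.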