Last modified: November 26, 2024
Glossary of Database and SQL Terms
- Database: A collection of organized data for easy access, management, and updating.
- Table: A structure with rows and columns for storing data in a database.
- Row (Record): A single entry in a table with data.
- Column (Field): A category of data within a table.
- Primary Key: A column or set of columns whose values uniquely identify each row in a table.
- Foreign Key: A column or set of columns in one table that refers to the primary key of another table, linking the two.
- Index: A data structure that speeds up data retrieval from a table, usually at the cost of extra storage and slower writes.
- Query: A request to access or modify data in a database.
- SQL (Structured Query Language): A language for working with relational databases.
- SELECT: An SQL command for getting data from a table.
- INSERT: An SQL command for adding new data to a table.
- UPDATE: An SQL command for changing existing data in a table.
- DELETE: An SQL command for removing data from a table.
- JOIN: An SQL operation that combines data from multiple tables based on shared columns.
- WHERE: An SQL keyword for filtering data based on specific conditions.
- GROUP BY: An SQL keyword for grouping rows with the same values in specified columns.
- ORDER BY: An SQL keyword for sorting results based on certain columns.
- Schema: The structure of a database, including tables, columns, and relationships.
- ACID (Atomicity, Consistency, Isolation, Durability): The four properties that make database transactions reliable.
- RDBMS (Relational Database Management System): Software for creating and managing relational databases, typically queried with SQL.
- Constraint: A rule applied to a table or its columns to keep data accurate and consistent.
- UNIQUE: A constraint that makes sure all values in a column are different.
- NOT NULL: A constraint that requires a column to have a value.
- CHECK: A constraint that requires every value in a column to satisfy a given condition.
- View: A virtual table created from the results of an SQL query.
- Alias: A temporary name given to a table or column in an SQL query for easier reference.
- TRANSACTION: A group of SQL operations executed as a single unit of work: either all of them take effect or none do.
- COMMIT: An SQL command for saving changes made by a transaction.
- ROLLBACK: An SQL command for undoing changes made by a transaction.
- TRIGGER: A procedure stored in the database that runs automatically when a specified event (INSERT, UPDATE, or DELETE) occurs on a table.
- Stored Procedure: A named set of SQL statements saved in the database and executed on demand.
- Function: A named routine stored in the database that accepts input parameters, performs actions, and returns a result.
- Normalization: A method for organizing data in a database to reduce redundancy and improve data integrity.
- Denormalization: A process of adding redundant data to a database to speed up query performance.
- DDL (Data Definition Language): A part of SQL for creating and modifying database objects like tables and indexes.
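- DML (Data Manipulation Language): A part of SQL for working with data in a database, including INSERT, UPDATE, and DELETE. SELECT is usually included as well, although some sources classify it separately as DQL (Data Query Language).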
- DCL (Data Control Language): A part of SQL for managing user access and permissions, such as GRANT and REVOKE.
- TCL (Transaction Control Language): A part of SQL for handling transactions, including COMMIT and ROLLBACK.
- NULL: A special marker in SQL indicating that a value is missing or unknown.
- NoSQL: A class of non-relational databases designed for handling various types of data, often trading relational features or strict consistency for easier horizontal scaling and more flexible schemas.
- CAP Theorem: A principle stating that a distributed data store can guarantee at most two of three properties at the same time: consistency, availability, and partition tolerance.
- Sharding: The process of splitting a large database into smaller pieces (shards) that are stored on separate servers, often improving performance and scalability.
- Partitioning: The practice of dividing a table into smaller pieces (partitions) based on the values of a specific column or set of columns.
- Replication: The process of copying and maintaining the same data on multiple database nodes to increase availability and fault tolerance.
- BASE (Basically Available, Soft State, Eventual Consistency): A set of attributes that describe the behavior of some distributed systems, providing a more relaxed approach to consistency compared to ACID properties.
- Graph Database: A type of NoSQL database that stores data as nodes and edges in a graph, optimized for querying and traversing relationships between data points.
- Amazon RDS: A managed relational database service provided by Amazon Web Services (AWS), offering support for multiple database engines, including MySQL, PostgreSQL, and Oracle.
- Amazon DynamoDB: A managed NoSQL database service provided by AWS, designed for high availability, scalability, and low latency.
- Amazon Aurora: A managed relational database service provided by AWS, offering compatibility with MySQL and PostgreSQL and improved performance, availability, and scalability.
- Caching: Temporary storage of query results or intermediate data to speed up subsequent query executions.
- Horizontal Scaling: The practice of adding more nodes to a system to handle increased workload, often used in distributed systems to improve performance and availability.
- Vertical Scaling: The practice of adding more resources, such as CPU or memory, to a single node to handle increased workload.
- In-Memory Database: A type of database that stores data in the main memory instead of on disk, providing faster data access and processing times.
- SQL Injection: A security vulnerability that occurs when an attacker is able to insert malicious SQL code into a query, potentially compromising the database or exposing sensitive data.
- ETL (Extract, Transform, Load): A process used to collect, clean, and move data from one or more sources to a data warehouse or another data store.
- OLTP (Online Transaction Processing): A class of systems designed for managing transactional workloads, such as inserting, updating, and deleting records.
- OLAP (Online Analytical Processing): A class of systems designed for managing analytical workloads, such as complex queries and aggregations.
- Data Warehousing: The practice of collecting large amounts of historical data from various sources in a central store (a data warehouse) that is optimized for analysis and reporting.
- Big Data: A term referring to the massive volume, variety, and velocity of data generated by modern applications and devices, often requiring specialized tools and techniques for processing and analysis.
- Hadoop: An open-source framework for distributed storage and processing of large datasets using the MapReduce programming model.
- MapReduce: A programming model for processing and generating large data sets in parallel across a distributed computing environment.
- Apache Spark: An open-source distributed data processing engine designed for high-performance, large-scale data processing and machine learning tasks.
- Apache Cassandra: A highly scalable, distributed NoSQL database designed for handling large amounts of data across many nodes, providing high availability and fault tolerance.
- Elasticsearch: An open-source, distributed search and analytics engine built on Apache Lucene, used for indexing and searching large volumes of data.
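
Several of the core terms above (Table, Primary Key, Foreign Key, Constraint, INSERT, SELECT, JOIN, WHERE, GROUP BY, ORDER BY, Alias) can be seen together in one small script. The following is a minimal sketch using Python's built-in `sqlite3` module; the `authors` and `books` tables and their rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# DDL: two hypothetical tables with primary keys, a foreign key,
# and NOT NULL, UNIQUE, and CHECK constraints.
conn.executescript("""
CREATE TABLE authors (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE books (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    price     REAL CHECK (price >= 0),
    author_id INTEGER NOT NULL REFERENCES authors(id)
);
""")

# DML: INSERT a few rows (made-up data).
conn.executemany(
    "INSERT INTO authors (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO books (title, price, author_id) VALUES (?, ?, ?)",
    [("Notes", 10.0, 1), ("Sketches", 20.0, 1), ("Compilers", 30.0, 2)],
)

# SELECT with JOIN, WHERE, GROUP BY, ORDER BY, and aliases (a, b, book_count).
rows = conn.execute("""
    SELECT a.name AS author, COUNT(*) AS book_count, SUM(b.price) AS total
    FROM books AS b
    JOIN authors AS a ON a.id = b.author_id
    WHERE b.price >= 10
    GROUP BY a.name
    ORDER BY book_count DESC
""").fetchall()
print(rows)

# The foreign key constraint rejects a book whose author does not exist.
try:
    conn.execute(
        "INSERT INTO books (title, price, author_id) VALUES (?, ?, ?)",
        ("Ghost", 5.0, 99),
    )
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

Other database engines accept very similar SQL, though data types and the way foreign key enforcement is enabled differ between systems.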
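
The TRANSACTION, COMMIT, and ROLLBACK entries can be illustrated with a transfer between two accounts. This is a sketch under stated assumptions: the `accounts` table, the `transfer` helper, and the balances are hypothetical, and it relies on the default transaction handling of Python's `sqlite3` module, which opens a transaction implicitly before an UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        # Credit first, so a failed debit leaves something to undo.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        conn.commit()    # COMMIT: make both changes permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # ROLLBACK: undo the credit that already ran
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

After the failed transfer, the credit to `bob` has been undone as well, which is the atomicity part of ACID.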
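
The SQL Injection entry is easier to follow with a concrete case. The sketch below, again using Python's `sqlite3` module, compares a query built by string formatting with one that uses a `?` placeholder; the `users` table and the malicious input are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with made-up rows.
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [("alice", "a1"), ("bob", "b2")]
)

# Input crafted so that the WHERE clause becomes always true.
malicious = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and parsed as SQL.
unsafe_sql = f"SELECT name FROM users WHERE name = '{malicious}'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: a placeholder passes the input as a plain value, never as SQL.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(leaked), safe)
```

The unsafe query returns every row in the table, while the parameterized query returns nothing because no user has that literal name.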