Info
Записки от лекции по “CSCB405 Бази данни и системи за управление на бази данни” към НБУ, водени от доц. Юлиана Пенева д-р.
Lecture 1: Data Organization and Database Basics
-
Introduction to Data Management Approaches:
- Data Management: Applying IT to convert large data volumes into meaningful information for business decision-making.
- Traditional Approach:
- File-Based Systems: Collection of files managed by individual programs, leading to data redundancy, dependency on specific applications, and potential inconsistency.
- Characteristics: Separate files for each application, individual programs for data processing, and data definitions embedded in application programs.
- Advantages: Simple to implement with minimal tools.
- Disadvantages: Data redundancy, integrity issues, data-program dependency, and limited flexibility in queries.
- Database Approach:
- DBMS: Software that manages databases, providing a centralized and integrated data storage system accessible to multiple applications.
- Characteristics: Data centralized, managed by DBMS, and accessible by all applications.
- Advantages: Data consistency, integrity, reduced redundancy, and flexibility in data handling.
-
Core Concepts:
- Data vs. Information:
- Data: Raw facts, like transactions or measurements, waiting for processing.
- Information: Processed and structured data useful for decision-making.
- Data Hierarchy:
- Symbol, Field, Record, File: Hierarchical organization from smallest to largest data units.
- Database: An organized collection of related data designed for specific uses, holding information that multiple users can access and manipulate.
- Database Users:
- End-users, Database Administrators (DBAs), Database Designers, Application Programmers.
- Data vs. Information:
-
Historical Overview of DBMS:
- Shift from file-based systems to database systems to address the shortcomings of traditional methods and improve data accessibility and integrity.
Lecture 2: Data Modeling and DBMS Architecture
-
Data Modeling:
- Definition: The process of representing a real-world application’s properties and requirements using abstract structures.
- Data Model Types:
- Physical Models: Low-level structures defining how data is physically stored.
- Logical Models: Conceptualize data independent of physical details.
- Examples: Hierarchical, Network, and Relational models.
- Object-Based Models: Conceptual models that are closer to how users perceive data.
- Examples: Entity-Relationship, Object-Oriented models.
- Key Components:
- Entities, Attributes, Relationships: Used to represent data structures and the relationships within an application.
- Constraints and Operations: Define data rules and permissible manipulations.
-
DBMS Architecture:
- Three-Level ANSI/SPARC Architecture:
- Internal (Physical) Level: Details how data is stored physically.
- Conceptual (Logical) Level: Organizes data as a collective schema, hiding storage details.
- External (View) Level: Defines different views for different user needs.
- Data Independence:
- Logical Data Independence: Change in conceptual schema without affecting the external schema.
- Physical Data Independence: Change in internal schema without affecting conceptual schema.
- DBMS Interfaces:
- DDL (Data Definition Language): Specifies database schema and constraints.
- DML (Data Manipulation Language): Defines queries for data retrieval and modification.
- Query Languages: High-level (e.g., SQL) for ease of use; procedural for specific data access patterns.
- Three-Level ANSI/SPARC Architecture:
-
Functional Components of a DBMS:
- Storage Management: Handles data storage.
- Query Processor: Analyzes and executes database queries.
- Transaction Management: Ensures ACID properties in multi-user environments.
- Concurrency Control: Manages simultaneous data access by multiple users.
Lecture 3: Relational Model Principles
-
Overview of the Relational Model:
- Origin: Developed by E.F. Codd in 1970, revolutionizing database management with a structured approach based on set theory and predicate logic.
- Main Concepts:
- Relation: Analogous to a table, it’s a set of tuples (rows) with each attribute (column) drawn from a domain.
- Attributes and Tuples: Attributes are columns; tuples are rows in the table.
- Keys:
- Superkey: A set of attributes uniquely identifying a tuple.
- Candidate Key: Minimal superkey with no unnecessary attributes.
- Primary Key: The chosen candidate key for a table.
- Foreign Key: A field in one table that uniquely identifies a row in another.
-
Relational Constraints:
- Domain Constraints: Limits on attribute values based on type or range.
- Entity Integrity: Primary keys cannot be null, ensuring each record’s uniqueness.
- Referential Integrity: Foreign keys must match primary keys in the referenced table or be null.
-
Relational Algebra:
- Set Operations: Union, intersection, and difference to combine relations.
- Relational Operations:
- Selection (σ): Extracts rows meeting criteria.
- Projection (π): Extracts specific columns.
- Join (⨝): Combines rows from two tables based on a related column.
- Division (÷): Useful for queries involving “for all” type relationships.
Lecture 4: SQL Overview
-
Introduction to SQL:
- Structured Query Language (SQL): The primary language for managing relational databases.
- Historical Background: Developed as SEQUEL for System R, it evolved to SQL, a declarative language defined by standards.
- Two Main Components:
- Data Definition Language (DDL): Commands for defining database schema (CREATE, ALTER, DROP).
- Data Manipulation Language (DML): Commands for manipulating data (SELECT, INSERT, UPDATE, DELETE).
-
Core SQL Commands:
- SELECT Statement: Used to retrieve data with various clauses.
- FROM: Specifies tables to retrieve data from.
- WHERE: Filters results based on conditions.
- GROUP BY: Aggregates data by grouping based on a column.
- HAVING: Sets conditions on grouped data.
- ORDER BY: Sorts results by specified columns.
- INSERT, UPDATE, DELETE: Commands for adding, modifying, and removing data.
- SELECT Statement: Used to retrieve data with various clauses.
-
Advanced SQL Features:
- Aggregate Functions: Functions like COUNT, SUM, AVG, MIN, and MAX to perform calculations on groups of data.
- Nested Queries: Sub-queries used within WHERE clauses or as part of larger queries for complex data retrieval.
- UNION, INTERSECT, DIFFERENCE: Set operations to combine query results.
- NULL Handling: SQL includes specific operations and functions to handle NULL values (e.g., IS NULL, COALESCE).
-
SQL Standards and Extensions:
- Evolution of SQL Standards:
- SQL2 (SQL-92): Extended SQL capabilities.
- SQL3: Introduced object-oriented features.
- Further Versions: SQL:2003, SQL:2008, SQL:2011, SQL:2016, each adding functionality like XML support, JSON, and advanced analytics.
- SQL’s Role in Data Management: It is central to relational DBMS due to its versatility, allowing data definition, manipulation, and control all in one language.
- Evolution of SQL Standards: