This document provides an organized and comprehensive overview of surveilr's RSSD migration process starting from v 1.0.0, breaking down each component and the steps followed to ensure smooth and efficient migrations. It covers the creation of key tables and views, the handling of migration cells, and the sequence for executing migration scripts.


Session and State Initialization

To manage temporary session data and track user state, we use the session_state_ephemeral table, which stores essential session information like the current user. This table is temporary, meaning it only persists data for the duration of the session, and it's especially useful for identifying the user responsible for specific actions during the migration.

Each time the migration process runs, we initialize session data in this table, ensuring all necessary information is available without affecting the core database tables. This initialization prepares the system for more advanced operations that rely on knowing the user executing each action.


Assurance Schema Table

The assurance_schema table is designed to store various schema-related details, including the type of schema assurance, associated codes, and related governance data. This table is central to defining the structure of assurance records, which are useful for validating data, tracking governance requirements, and recording creation timestamps. All updates to the schema are logged to track when they were last modified and by whom.


Code Notebook Kernel, Cell, and State Tables

surveilr uses a structured system of code notebooks to store and execute SQL commands. These commands, or “cells,” are grouped into notebooks, and each notebook is associated with a kernel, which provides metadata about the notebook's language and structure. The main tables involved here are:

  • code_notebook_kernel: Stores information about different kernels, each representing a unique execution environment or language.
  • code_notebook_cell: Holds individual code cells within each notebook, along with their associated metadata and execution history.
  • code_notebook_state: Tracks each cell's state changes, such as when it was last executed and any errors encountered.

By organizing migration scripts into cells and notebooks, surveilr can maintain detailed control over execution order and track the state of each cell individually. This tracking is essential for handling updates, as it allows us to execute migrations only when necessary.


Views for Managing Cell Versions and Migrations

Several views are defined to simplify and organize the migration process by managing different versions of code cells and identifying migration candidates. These views help filter, sort, and retrieve the cells that need execution.

Key Views

  • code_notebook_cell_versions: Lists all available versions of each cell, allowing the migration tool to retrieve older versions if needed for rollback or auditing.
  • code_notebook_cell_latest: Shows only the latest version of each cell, simplifying the migration by focusing on the most recent updates.
  • code_notebook_sql_cell_migratable: Filters cells to include only those that are eligible for migration, ensuring that non-executable cells are ignored.

Migration-Oriented Views and Dynamic Migration Scripts

To streamline the migration process, several migration-oriented views organize the data by listing cells that require execution or are ready for re-execution. By grouping these cells in specific views, surveilr dynamically generates a script that executes only the necessary cells.

Key Views

  • code_notebook_sql_cell_migratable_not_executed: Lists migratable cells that haven’t yet been executed.
  • code_notebook_sql_cell_migratable_state: Shows the latest migratable cells, along with their current state transitions.

How Migrations Are Executed

When it's time to apply changes to the database, this section explains the process in detail, focusing on how surveilr prepares the environment, identifies which cells to migrate, executes the appropriate SQL code, and ensures data integrity throughout.


1. Initialization

The first step in the migration process involves setting up the essential database tables and seeding initial values. This lays the foundation for the migration process, making sure that all tables, views, and temporary values needed are in place.

  • Check for Core Tables: surveilr first verifies whether the required tables, such as code_notebook_cell, code_notebook_state, and others starting with code_notebook%, are already set up in the database.
  • Setup: If these tables do not yet exist, surveilr automatically initiates the setup by running the initial SQL script, known as bootstrap.sql. This script contains SQL commands that create all the essential tables and views discussed in previous sections.
  • Seeding: During the execution of bootstrap.sql, essential data, such as temporary values in the session_state_ephemeral table (e.g., information about the current user), is also added to ensure that the migration session has the data it needs to proceed smoothly.

2. Migration Preparation and Identification of Cells to Execute

Once the environment is ready, surveilr examines which specific cells (code blocks in the migration notebook) need to be executed to bring the database up to the latest version.

  • Listing Eligible Cells: surveilr begins by consulting views such as code_notebook_sql_cell_migratable_not_executed. This view is a pre-filtered list of cells that are eligible for migration but haven’t yet been executed.
  • Idempotent vs. Non-Idempotent Cells: surveilr then checks whether each cell is marked as “idempotent” or “non-idempotent.”
    • Idempotent Cells can be executed multiple times without adverse effects. If they have been run before, they can safely be run again without impacting data integrity.
    • Non-Idempotent Cells, identified by names containing _once_, should only be executed once. If these cells have been executed previously, they are skipped in the migration process to prevent unintentional re-runs.

3. Dynamic Script Generation and Execution

surveilr then assembles a custom SQL script that includes only the cells identified as eligible for execution. This script is crafted carefully to ensure each cell's SQL code is executed in the correct order and with the right contextual information.

  • Script Creation: We start by generating a dynamic script in a single transaction block. Transactions are a way of grouping a series of commands so that they are either all applied or none are, which protects data integrity.
  • Inclusion of Cells Based on Eligibility:
    • For each cell, surveilr checks its eligibility status. If it's non-idempotent and already executed, it's marked with a comment noting that it's excluded from the script due to previous execution.
    • If the cell is idempotent or eligible for re-execution, its SQL code is added to the script, along with additional details such as comments about the cell's last execution date.
  • State Transition Records: After each cell's SQL code, additional commands are added to record the cell's transition state. This step inserts information into code_notebook_state, logging details such as the cell ID, transition state (from “Pending” to “Executed”), and the reason for the transition (“Migration” or “Reapplication”). These logs are invaluable for auditing purposes.

4. Execution in a Transactional Block

With the script prepared, surveilr then executes the entire batch of SQL commands within a transactional block.

  • BEGIN TRANSACTION: The script begins with a transaction, ensuring that all changes are applied as a single, atomic unit.
  • Running Cell Code: Within this transaction, each cell's SQL code is executed in the order it appears in the script.
  • Error Handling: If any step in the transaction fails, all changes are rolled back. This prevents partial updates from occurring, ensuring that the database remains in a consistent state.
  • COMMIT: If the script executes successfully without errors, the transaction is committed, finalizing the changes. The COMMIT command signifies the end of the migration session, making all updates permanent.

5. Finalizing Migration and Recording Results

After a successful migration session, surveilr concludes by recording details about the migration process.

  • Final Updates to code_notebook_state: Any cells marked as “Executed” are updated in code_notebook_state with the latest timestamp, indicating their successful migration.
  • Logging Completion: Activity logs are updated with relevant details, ensuring a clear record of the migration.
  • Cleanup of Temporary Data: Finally, temporary data is cleared, such as entries in session_state_ephemeral, since these values were only needed during the migration process.

Pending Migrations

code_notebook_sql_cell_migratable_not_executed lists all cells eligible for migration but not yet executed. If migrations have been completed successfully, this list will be empty, indicating that all migratable cells have been processed and marked as executed.

01JDKQ54KX969861TTP9DGWZ27 ConstructionSqlNotebook session_ephemeral_table 1 2024-12-04 11:17:11
01JDKQ54KXFTHEHK3NJ3S887RQ ConstructionSqlNotebook v001_seedDML 1 2024-12-04 11:17:11

State of Executed Migrations

code_notebook_sql_cell_migratable_state displays all cells that have been successfully executed as part of the migration process, showing the latest version of each migratable cell. For each cell, it provides details on its transition states, the reason and result of the migration, and the timestamp of when the migration occurred.

01JDKQ54KWXXK7Z9TK42PS4XDT ConstructionSqlNotebook v001_once_initialDDL 0 2024-12-04 11:17:11 NONE EXECUTED execute_migrations 2024-12-04 11:17:11

Executable Migrations

All cells that are candidates for migration (including duplicates)

01JDKQ54KX969861TTP9DGWZ27 ConstructionSqlNotebook session_ephemeral_table

session_ephemeral_table

b739acd000cf37091bbb365085506f975345351d 1 2024-12-04 11:17:11
01JDKQ54KWXXK7Z9TK42PS4XDT ConstructionSqlNotebook v001_once_initialDDL

v001_once_initialDDL

da8557bdf6fe89d9e489e0b498ba136053b4bf3a 0 2024-12-04 11:17:11
01JDKQ54KXFTHEHK3NJ3S887RQ ConstructionSqlNotebook v001_seedDML

v001_seedDML

7d8f175c6d1e0dd76fb695437d35d4e49e1101b8 1 2024-12-04 11:17:11