Data Purge Procedures


This guide provides detailed instructions for purging data from your system. Whether you are clearing outdated data, targeting specific applications, or performing batch deletions, it walks you through the steps and scripts needed to complete the process.

Getting Started

This section introduces the key concepts and preliminary steps to complete before initiating any data purge. Understanding them will help you run the process and avoid potential issues.

Prerequisites for Purging Data from the Dashboard

Before purging any data, it’s essential to review what will be deleted to prevent accidental loss of critical information. Follow the steps below to make sure nothing is removed unintentionally.

Checking Pre-Delete Data from the Dashboard

  1. Access the Management Section - Navigate to the Management > DB Viewer section within your dashboard, where you can view and manage your data.
  2. Understanding Data Deletion Scope:
  • Scheduled scripts to delete data based on timestamp - These scripts delete data by setting a TTL (Time To Live) index on the `cd` (creation date) field of collections containing drill data. This method schedules deletion based on when the data was added to the database.
  • Scripts to delete data immediately based on timestamp - Unlike the scheduled scripts, this script deletes data immediately based on the `ts` (timestamp) field, which reflects when the event occurred on the device.
  3. Recheck Script Parameters - Before running any script, review the parameters to ensure they match your data retention needs. For example, the EXPIRE_AFTER parameter sets the expireAfterSeconds value of the TTL index:

EXPIRE_AFTER = 60 * 60 * 24 * 365; // 1 year in seconds

Adjust this value if you need a different retention period.
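The retention arithmetic can be made explicit with a small helper. The sketch below is an illustration only: `retentionSeconds` is a hypothetical name that is not part of the purge scripts. It simply reproduces the `60 * 60 * 24 * 365` calculation for an arbitrary number of days.

```javascript
// Hypothetical helper, not part of the purge scripts:
// converts a retention period in whole days to seconds.
const SECONDS_PER_DAY = 60 * 60 * 24;

function retentionSeconds(days) {
  if (!Number.isInteger(days) || days <= 0) {
    throw new RangeError('days must be a positive integer');
  }
  return days * SECONDS_PER_DAY;
}

// 365 days gives the same value as the one-year default.
const oneYear = retentionSeconds(365); // 31536000
const ninetyDays = retentionSeconds(90); // 7776000
```

The result can then be assigned to EXPIRE_AFTER in the script in place of the hard-coded product.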

  4. Running the Script:
  • Use the following command to execute the script:

mongosh --file mongo_expireData.js

  • Ensure that the connection is properly set up, typically with:
var authDB = db.getSiblingDB('admin'); // database used for authentication
authDB.auth('username', 'password'); // replace with your own credentials
  5. Monitoring the Pre-Delete Status:
  • Initially, there won’t be any visible changes. The database will start building the TTL indexes, which will increase resource usage.
  • You can verify the TTL index by checking the collection indexes. There should be an index on the `cd` field with an expireAfterSeconds value, indicating that the TTL index is active.

The index should have fields similar to those shown below:

{
  "v": 2,
  "key": {
    "cd": 1
  },
  "name": "cd_1",
  "expireAfterSeconds": 31536000
}
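The index check described above can be automated. The following is a minimal sketch, assuming the index list has the shape of the index document shown above; `findTtlIndex` and `sampleIndexes` are hypothetical names introduced for illustration, and the sample data stands in for real output.

```javascript
// Hypothetical helper: returns the TTL index on `field`, or null if none exists.
// `indexes` is an array of index documents shaped like the example above.
function findTtlIndex(indexes, field) {
  const match = indexes.find((idx) => {
    const keys = Object.keys(idx.key || {});
    // A TTL index is a single-field index that carries expireAfterSeconds.
    return keys.length === 1 && keys[0] === field &&
      typeof idx.expireAfterSeconds === 'number';
  });
  return match || null;
}

// Sample data mirroring the index document shown above.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: '_id_' },
  { v: 2, key: { cd: 1 }, name: 'cd_1', expireAfterSeconds: 31536000 },
];
const ttl = findTtlIndex(sampleIndexes, 'cd');
// ttl.expireAfterSeconds is 31536000 for this sample.

// In mongosh, the input would come from getIndexes(), for example:
//   findTtlIndex(db.getCollection('<collection name>').getIndexes(), 'cd');
```

A null result means the collection has no TTL index on the field yet, for instance because the script has not run against it.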

Checking Post-Delete Data from the Dashboard

After the data purge, it’s important to verify that the deletion was executed correctly:

  1. Return to the Management Section - Go back to the Management > DB Viewer section to review the updated data.
  2. Post-Delete Behavior:
  • For scheduled scripts to delete data based on timestamp, data will be deleted gradually as the TTL index triggers deletion for items older than the specified expireAfterSeconds value. For instance, if the script was run on July 1st, data older than one year will be deleted starting from August 1st.
  • The script that deletes data immediately based on timestamp removes data according to the `ts` field. Once the script runs, the specified data will no longer be present in the drill database.
  3. Verification - Ensure that the TTL index is functioning by checking if old data is being removed as expected. For scripts that delete data immediately based on timestamp, confirm that the targeted data has been fully purged.
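One way to check whether old data is being removed is to compute the cutoff date implied by expireAfterSeconds and count what remains before it. The sketch below is an assumption-laden illustration: `ttlCutoff` is a hypothetical helper, the date is an arbitrary example, and the commented mongosh query assumes `cd` is stored as a date, which TTL indexes require.

```javascript
// Hypothetical helper: the date before which documents count as expired
// for a TTL index with the given expireAfterSeconds.
function ttlCutoff(now, expireAfterSeconds) {
  return new Date(now.getTime() - expireAfterSeconds * 1000);
}

const expireAfter = 60 * 60 * 24 * 365; // 1 year in seconds
const cutoff = ttlCutoff(new Date('2023-07-01T00:00:00Z'), expireAfter);
// cutoff.toISOString() is '2022-07-01T00:00:00.000Z'

// In mongosh, documents that should already have expired could be counted with:
//   db.getCollection('<collection name>').countDocuments({ cd: { $lt: cutoff } });
```

A count that shrinks toward zero over time suggests the TTL index is doing its work; a count that stays constant is a reason to recheck the index.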

Verifying Script Execution

To ensure the data purge script has been executed properly, follow these steps:

  1. Check Script Execution Status - Review the execution log to see if the script ran without errors. The log should provide details about the execution, including any issues encountered.
  2. Manual Script Execution (if necessary) - If the script did not execute or encountered errors, you might need to run it manually using the command provided earlier.
  3. Monitor Database Resource Usage - After running the script, you should notice increased resource usage as the database processes the TTL indexes. This is normal and indicates that the script is working correctly.

Purging Data

The data purging process involves running specific scripts based on your needs. Below are the various options available depending on the scope of data you wish to purge:

Overall Data Purge

Use this script to purge all data from your system. It sets TTL indexes based on the `cd` field, deleting data as it reaches the specified age.

Overall Data Purge Script
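For orientation, the core of a TTL-based purge can be sketched as below. This is not the actual purge script, which may differ; `setTtlIndexes` is a hypothetical name and the list of collection names is left to the caller. It only relies on `getCollection` and `createIndex` with the `expireAfterSeconds` option, which mongosh provides.

```javascript
// Illustrative sketch only; the actual purge script may differ.
// `db` is any object exposing getCollection(name).createIndex(keys, options),
// as the mongosh `db` object does.
function setTtlIndexes(db, collectionNames, expireAfterSeconds) {
  // Creates a TTL index on `cd` for each collection and returns the results.
  return collectionNames.map((name) =>
    db.getCollection(name).createIndex(
      { cd: 1 },
      { expireAfterSeconds: expireAfterSeconds }
    )
  );
}

// Example call in mongosh (the collection name is a placeholder):
//   setTtlIndexes(db, ['<collection name>'], 60 * 60 * 24 * 365);
```

Passing the database object as a parameter keeps the function easy to try against a stub before pointing it at real data.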

Single App Data Purge

To remove data from a single application, use this script. It functions similarly to the overall purge script but targets data from one specific app.

Single App Purge Script

Multiple App Data Purge

For purging data from multiple applications simultaneously, use this script. It also sets TTL indexes based on the `cd` field across multiple app collections.

Multiple App Purge Script

Purging Data in Batches

If you need to purge large amounts of data without overloading your server, consider using this script. It deletes data immediately, based on the `ts` field, and must be rerun as needed to handle new data.

Batch Purge Script
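The batching idea can be sketched as follows: split the `ts` range into windows and delete one window at a time so that no single operation touches too much data. This is a rough illustration rather than the batch purge script itself; `batchWindows` and `purgeInBatches` are hypothetical names, and the range bounds and step must be given in whatever unit the `ts` field uses.

```javascript
// Hypothetical helper: splits the half-open range [from, to) into
// consecutive windows of at most `step` units each.
function batchWindows(from, to, step) {
  if (step <= 0) throw new RangeError('step must be positive');
  const windows = [];
  for (let start = from; start < to; start += step) {
    windows.push({ from: start, to: Math.min(start + step, to) });
  }
  return windows;
}

// Deletes one window at a time and returns the total number of deleted documents.
// `collection` is any object exposing deleteMany(filter), as a mongosh collection does.
function purgeInBatches(collection, from, to, step) {
  let deleted = 0;
  for (const w of batchWindows(from, to, step)) {
    const res = collection.deleteMany({ ts: { $gte: w.from, $lt: w.to } });
    deleted += res.deletedCount;
  }
  return deleted;
}

// Example call in mongosh (collection name and bounds are placeholders):
//   purgeInBatches(db.getCollection('<collection name>'), fromTs, toTs, step);
```

A smaller step means more, lighter delete operations; a larger step finishes in fewer round trips but puts more load on the server per operation.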

Following these procedures helps keep your data purging process thorough and efficient. Regularly monitor the results and adjust the scripts as necessary to meet your organization's data retention policies.
