Migration Guide to version 25.03

Countly 25.03 has been released, and this migration guide will help you navigate the upgrade process. It outlines the important changes, the benefits, and the steps needed for a smooth transition to the latest version.

Before Upgrading to 25.03

MongoDB and NodeJS versions

Starting with Countly 25.03, we use MongoDB 8.0 as the default database version and NodeJS 20 as the default NodeJS version.

This means that before upgrading, you need to upgrade your MongoDB to 8.0. You can do that via the script we provide or manually. The script should be run on the server where your MongoDB runs. As always, take backups before you upgrade anything. MongoDB must also be upgraded incrementally, so if you are upgrading from MongoDB 6.0, do not skip 7.0.

To upgrade from MongoDB 6.0 to MongoDB 7.0, run the script we provide on the MongoDB host.
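If you prefer to upgrade manually instead, the procedure looks like the following. This is a sketch assuming Ubuntu 22.04, a standalone mongod, and packages from the official MongoDB apt repository; adjust for your distribution and topology.

# Run on the MongoDB host, after taking a backup
mongosh --eval 'db.adminCommand({ setFeatureCompatibilityVersion: "6.0" })'
sudo wget -qO /etc/apt/trusted.gpg.d/mongodb-7.0.asc https://www.mongodb.org/static/pgp/server-7.0.asc
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
sudo apt-get update && sudo apt-get install -y mongodb-org
sudo systemctl restart mongod
mongosh --eval 'db.adminCommand({ setFeatureCompatibilityVersion: "7.0", confirm: true })'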

To upgrade from MongoDB 7.0 to MongoDB 8.0, run the corresponding script on the MongoDB host.
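The manual equivalent follows the same pattern as above (same assumptions), with the repository and feature compatibility version moved to 8.0:

# Run on the MongoDB host, after confirming 7.0 works and taking a backup
sudo wget -qO /etc/apt/trusted.gpg.d/mongodb-8.0.asc https://www.mongodb.org/static/pgp/server-8.0.asc
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/8.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-8.0.list
sudo apt-get update && sudo apt-get install -y mongodb-org
sudo systemctl restart mongod
mongosh --eval 'db.adminCommand({ setFeatureCompatibilityVersion: "8.0", confirm: true })'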

Then, for Countly Server, upgrade the NodeJS version to 20.x.
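A common way to do this on Debian/Ubuntu is the NodeSource setup script (a sketch; adjust for your distribution, and restart Countly afterwards):

curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
node -v   # should print v20.x
sudo countly restart   # restart Countly so it picks up the new NodeJS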

Migrating to new data model with merged events collections

Since we are moving data from multiple separate, dynamically created collections into a single collection, there are some additional steps to be aware of. Please follow the data model migration guide later in this article.

Continue with regular upgrade

Afterward, run the normal upgrade procedure, executing all upgrade scripts between your version and 25.03 (including bin/upgrade/25.03/upgrade.sh).
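For example, for a package-based installation, from the Countly installation directory:

sudo bash ./bin/upgrade/25.03/upgrade.sh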

Introduction to the New Data Model

The new data model introduces two key collections to simplify data management:

  1. Aggregated Data - All aggregated events from various apps now reside in a single collection called countly.events_data, replacing the multiple countly.events+HASH collections.
  2. Granular Data - Granular event data is now collected in a single collection called countly_drill.drill_events, instead of multiple drill_events+HASH collections. After the upgrade, Countly will read from both the old and new collections, so migrating granular data is not mandatory.

To accommodate this consolidation, two new fields have been introduced for granular drill documents:

  • a for indicating the app from which the event originates.
  • e for the event key.
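For example, with these fields, granular events for a single app and event key can be read from the merged collection (a sketch; the app ID below is the illustrative value used later in this guide):

mongosh
use countly_drill
db.drill_events.find({ "a": "6423f48e7393a0ca3410f42d", "e": "Login" }).sort({ ts: -1 }).limit(10)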

Some fields have been removed to reduce document size; the date-breakdown fields (d, w, m, and h in the old documents) can be derived from the ts field at runtime. This new approach aims to improve performance, simplify data management, and provide better scalability.

Benefits

  • Simplified Data Export - Instead of querying multiple collections for all events, exporting or querying data is now as simple as accessing a single collection (see the export sketch after this list).
  • Easier Data Management - Managing indexes, sharding, TTL for data retention, and deleting unused collections becomes significantly easier with a single collection.
  • Improved Performance - Writing to a single collection improves performance due to how MongoDB's WiredTiger engine handles writing operations.
  • No Hard Event Key Limits - While we still recommend keeping event keys manageable, the new data model removes the hard limit on event keys and reduces performance penalties.
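As an illustration of the export case, dumping all granular events of one app is now a single mongoexport call (a sketch; the app ID and output path are placeholders):

mongoexport --db countly_drill --collection drill_events --query '{"a":"6423f48e7393a0ca3410f42d"}' --out /tmp/drill_events_export.json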

How the New Model is Different from the Old Model

Old Model

  1. Structure - Utilized multiple collections for each app and event. This resulted in a complex schema with numerous collections.
  2. Indexing and Sharding - Managing indexes and sharding across many collections was complex and prone to errors.
  3. Data Export - Exporting data required handling each collection individually, making the process inefficient.
  4. Performance - The old MongoDB engine used collection-level locks, which restricted throughput and impacted performance.

New Model

  1. Structure - Consolidates into two main collections: countly.events_data for aggregated data and countly_drill.drill_events for detailed event records.
  2. Indexing and Sharding - Simplified with fewer collections, improving management and performance (see the index sketch after this list).
  3. Data Export - Streamlined by querying a single collection, making the export process more efficient.
  4. Performance - Enhanced by reducing the number of collections and leveraging improvements in the new MongoDB engine.
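As an illustration of the simplified index management, a single compound index on the merged drill collection can serve per-app, per-event, time-ordered queries (an assumed index shape for illustration, not necessarily the exact indexes Countly creates):

mongosh
use countly_drill
db.drill_events.createIndex({ "a": 1, "e": 1, "ts": -1 })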

Migrating to the New Data Model

Migration Checklist

Before starting the migration, make sure to verify the following requirements:

  • Disk Space - Ensure at least double the disk space currently used by the countly.events+HASH collections is available (see the estimation sketch after this list).
  • Data Cleanup - Delete old or unnecessary data to reduce the migration size.
  • Custom Indexes & Data Retention - Record any custom indexes or retention policies applied to drill collections, as these will need to be reapplied post-migration.
  • Migration Time - Plan for the migration window. Migrating 2 billion aggregated data points can take up to 1 hour.
  • Test Setup - Set up a test environment to confirm that the new Countly setup works with your data exports and queries.
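To estimate how much additional space the merged aggregated collection will need, you can sum the data size of the existing per-event collections (a mongosh sketch):

mongosh
use countly
var total = 0;
db.getCollectionNames().forEach(function(collName) { if (collName.startsWith("events") && collName !== "events") { total += db[collName].stats().size; } });
print("Aggregated event data: " + (total / 1024 / 1024).toFixed(0) + " MB");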

Tasks before upgrading

  • Create backups of the data stored in the database (see the backup sketch after this list).
  • Back up the Countly directory if you are not using Docker.
  • Stop incoming data.
  • Get the latest Countly release, either as a package or as the latest Docker image.
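For example, backups can be taken with mongodump plus a plain archive of the Countly directory (a sketch; the backup destination and installation path are placeholders):

mongodump --db countly --out /backups/pre-25.03
mongodump --db countly_drill --out /backups/pre-25.03
tar -czf /backups/pre-25.03/countly-dir.tar.gz /path/to/countly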

Steps to Migrate Aggregated Data to the New Data Model

When you run the Countly upgrade.sh script, it will migrate only the aggregated data.

Granular data does not need to be migrated. The new Countly version will write new data into the new collection but will still read data from the old collections.

Because the data is duplicated, the new collection will take up the same amount of space as all current aggregated event collections combined.

The migration script can also be run later (after the upgrade), and stopping incoming data is optional; however, migration will take 40 times longer for collections that have already received newly recorded data.

If this migration script is skipped, data recorded before the upgrade will not be visible in the aggregated data sections for events.

If the script exits with one of these messages:

Script failed. Exiting. PLEASE RERUN SCRIPT TO MIGRATE ALL DATA.
Script failed. Exiting

it is advised to share the output with the Countly team to check for issues.

If the final output contains a line like:

"Failed to merge collections: (NUMBER)"

this means that some of the collections were not fully moved, and the output has to be checked for errors.

The script can be run multiple times.

Upon successfully running the migration script, verify in the dashboard that the old data is visible in the Events -> All Events section. The old aggregated event collections can then be removed by running this script:

/bin/scripts/data-cleanup/remove_old_events_collections.js
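For example, assuming the path is relative to the Countly installation directory and a default local MongoDB connection, the script can be executed with mongosh:

mongosh bin/scripts/data-cleanup/remove_old_events_collections.js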

Steps to migrate Drill data to the New Data Model

There are no separate steps necessary. Once you upgrade, data will be written into a new drill collection, but when you query that data, it will be queried from both the new and old collections.

This will increase drill query times by up to 2x, since two collections now need to be queried for each request.

Once the older data expires and you no longer need it, you can delete the old drill collections by running:

use countly_drill
db.getCollectionNames().forEach(function(collName) { if(collName.startsWith("drill_events") && collName !== "drill_events") { db[collName].drop(); } });

Then, you can disable querying the old collections via "Management -> Settings -> Drill -> Union with data from old collections" so that only the single new drill collection is queried, improving drill query times.

If you want to migrate old drill data to the new collection to keep it for a longer period of time, please contact Countly support. However, this kind of migration may take a long time (2 billion data points could take up to 100 hours) and requires double the disk space.

In case of rollback

Aggregated data recorded while the server was running the new version will be lost. However, if you use the Enterprise version, it can be regenerated from granular data.

So, if you upgrade, detect issues, and roll back to an older version, all newly recorded drill and aggregated data will not be visible.

Drill data will still be usable when you upgrade again, but aggregated data will need to be recalculated.

Before upgrading back to the new version, clear the countly.events_data collection.

mongosh
use countly
db.events_data.drop()

Then, for all events collections, run an update to unmark the previously merged collections:

mongosh
use countly
db.getCollectionNames().forEach(function(collName) { if(collName.startsWith("events") && collName !== "events") { db[collName].updateMany({}, {"$unset": {"merged": ""}}); } });

Differences in how drill data is stored

Old Data Model

In the old data model, each event key had two collections: one for aggregated data and one for granular (drill) data. Take the following example of two apps and their events. The old model would look like this:

App: My Website

  • Event: Clicked button
  • Event: Purchase made

App: My App

  • Event: Login

Results in

  • countly.events5acc585c82ec317a7d00d06a39b9453697a3e84b
  • countly_drill.drill_events5acc585c82ec317a7d00d06a39b9453697a3e84b
  • countly.events5d1c6e6925889e294cff2b135d7b65d66a741688
  • countly_drill.drill_events5d1c6e6925889e294cff2b135d7b65d66a741688
  • countly.events54c2f21f3f8b98e22fc6afe8a3511caed7fc8240
  • countly_drill.drill_events54c2f21f3f8b98e22fc6afe8a3511caed7fc8240

With this data model, a query across all events has to touch every one of the collections mentioned above, resulting in increased latency. This is why a new data model has been introduced.

New Data Model

In the new data model, the same example results in just two collections for all events across all apps.

App: My Website

  • Event: Clicked button
  • Event: Purchase made

App: My App

  • Event: Login

Results in

  • countly.events_data
  • countly_drill.drill_events

As you can see above, only a single collection per database needs to be queried, reducing latency and increasing throughput.

Granular Document

Old Data Model

{
  "_id": "04d20438e44d212643fc9d0dcb25643e7f050f241626189010_S_1626185010000_1",
  "uid": "S",
  "did": "98039e11-a829-1331-e0bd-15ac946b919d",
  "lsid": "04d20438e44d212643fc9d0dcb25643e7f050f241626189010_S_1626189010000",
  "ts": 1626185010000,
  "cd": { "$date": "2021-07-16T15:38:24.145Z" },
  "d": "2021:7:13",
  "w": "2021:w28",
  "m": "2021:m7",
  "h": "2021:7:13:h17",
  "s": 0,
  "dur": 0,
  "c": 1,
  "up": { ... },
  "custom": { ... },
  "sg": { ... }
}

New Data Model

The corresponding new-model document (an illustrative reconstruction) keeps the same core fields, adds the a and e fields, and drops the derived date-breakdown fields d, w, m, and h:

{
  "_id": "04d20438e44d212643fc9d0dcb25643e7f050f241626189010_S_1626185010000_1",
  "a": "6423f48e7393a0ca3410f42d",
  "e": "Login",
  "uid": "S",
  "did": "98039e11-a829-1331-e0bd-15ac946b919d",
  "lsid": "04d20438e44d212643fc9d0dcb25643e7f050f241626189010_S_1626189010000",
  "ts": 1626185010000,
  "cd": { "$date": "2021-07-16T15:38:24.145Z" },
  "s": 0,
  "dur": 0,
  "c": 1,
  "up": { ... },
  "custom": { ... },
  "sg": { ... }
}

Properties of the Data Model

The newly added properties are described below. The remaining properties are covered in the Countly drill data documentation.

  • "a" - The application ID from which the event originates. Example value: "6423f48e7393a0ca3410f42d"
  • "e" - The event key. Example value: "Login"

FAQs

Can we read from old and new collections at the same time?

Yes, the code has been changed so that it reads from both the old and new collections simultaneously.
