The latest version, Countly 25.03, has been released, and this migration guide will help you navigate the upgrade process. It outlines the important changes, their benefits, and the steps required for a smooth transition to the latest version.
Before Upgrading to 25.03
MongoDB and NodeJS versions
Starting with Countly 25.03, MongoDB 8.0 is the default database version and NodeJS 20 is the default runtime version.
This means that before upgrading, you need to upgrade your MongoDB to 8.0. You can do that via the script we provide or manually. The script should be run on the server where your MongoDB runs. As always, take backups before you upgrade anything. MongoDB must also be upgraded incrementally, so if you are upgrading from MongoDB 6.0, do not skip 7.0.
To upgrade from MongoDB 6.0 to MongoDB 7.0 run
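A minimal sketch of the manual route, assuming a package-based MongoDB installation on a single server; the Countly-provided script automates these steps, so check the bin/scripts directory of your release for its exact name and path:
# confirm the feature compatibility version is at 6.0 before upgrading
mongosh --eval 'db.adminCommand({ getParameter: 1, featureCompatibilityVersion: 1 })'
# install the MongoDB 7.0 packages for your distribution, then restart mongod
sudo systemctl restart mongod
# finalize the upgrade by raising the feature compatibility version
mongosh --eval 'db.adminCommand({ setFeatureCompatibilityVersion: "7.0", confirm: true })'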
To upgrade from MongoDB 7.0 to MongoDB 8.0 run
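The same pattern applies for the next step, once the server runs MongoDB 7.0 with featureCompatibilityVersion "7.0":
# install the MongoDB 8.0 packages for your distribution, then restart mongod
sudo systemctl restart mongod
# finalize the upgrade by raising the feature compatibility version
mongosh --eval 'db.adminCommand({ setFeatureCompatibilityVersion: "8.0", confirm: true })'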
Then for Countly Server, to upgrade the NodeJS version to 20.x run
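A representative approach for Debian/Ubuntu systems using the NodeSource repository (RPM-based systems would use the corresponding rpm setup script); verify this against the scripts shipped in your Countly release:
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
node --version   # should print v20.x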
Migrating to new data model with merged events collections
Since we are moving data from multiple dynamically created, separate collections into a single collection, there are some additional steps to be aware of. Please follow the data model migration guide later in this article.
Continue with regular upgrade
Afterward, run the normal upgrade procedure of all upgrade scripts between your version and 25.03 (including bin/upgrade/25.03/upgrade.sh).
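A sketch, assuming a package-based installation under /opt/countly (adjust the path to your setup) and that 25.03 is the only upgrade step between your version and the target:
cd /opt/countly
sudo bash bin/upgrade/25.03/upgrade.sh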
Introduction to the New Data Model
The new data model introduces two key collections to simplify data management:
- Aggregated Data - All aggregated events from various apps will now reside in a single collection called countly.events_data, replacing the multiple countly.events+HASH collections.
- Granular Data - Granular event data will now be collected in a single collection called countly_drill.drill_events, instead of multiple drill_events+HASH collections. After the upgrade, Countly will read from both the old and new collections. Migrating granular data is not mandatory.
To accommodate this consolidation, two new fields have been introduced for granular drill documents:
- a - indicates the app from which the event originates.
- e - the event key.
Some fields have been removed to reduce document size; these fields can be derived from the ts field at runtime, as the sketch below illustrates. This new approach aims to improve performance, simplify data management, and provide better scalability.
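A hypothetical JavaScript sketch of deriving date breakdown values from ts; the exact set of derived fields and their formats, including app timezone handling, are defined by the Countly server code, so treat this only as an illustration:
// derive date breakdown strings from a millisecond timestamp
const ts = 1626185010000;
const dt = new Date(ts); // app timezone handling is omitted here
const d = dt.getFullYear() + ":" + (dt.getMonth() + 1) + ":" + dt.getDate(); // day, e.g. "2021:7:13"
const m = dt.getFullYear() + ":m" + (dt.getMonth() + 1);                     // month, e.g. "2021:m7"
const h = d + ":h" + dt.getHours();                                          // hour bucket, e.g. "2021:7:13:h17"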
Benefits
- Simplified Data Export - Instead of querying multiple collections for all events, exporting or querying data is now as simple as accessing a single collection.
- Easier Data Management - Managing indexes, sharding, TTL for data retention, and deleting unused collections becomes significantly easier with a single collection.
- Improved Performance - Writing to a single collection improves performance due to how MongoDB's WiredTiger engine handles write operations.
- No Hard Event Key Limits - While we still recommend keeping event keys manageable, the new data model removes the hard limit on event keys and reduces performance penalties.
How the New Model is Different from the Old Model
Old Model
- Structure - Utilized multiple collections for each app and event. This resulted in a complex schema with numerous collections.
- Indexing and Sharding - Managing indexes and sharding across many collections was complex and prone to errors.
- Data Export - Exporting data required handling each collection individually, making the process inefficient.
- Performance - The old MongoDB engine used collection-level locks, which restricted throughput and impacted performance.
New Model
- Structure - Consolidates data into two main collections: countly.events_data for aggregated data and countly_drill.drill_events for detailed event records.
- Indexing and Sharding - Simplified with fewer collections, improving management and performance.
- Data Export - Streamlined by querying a single collection, making the export process more efficient.
- Performance - Enhanced by reducing the number of collections and leveraging improvements in the new MongoDB engine.
Migrating to the New Data Model
Migration Checklist
Before starting the migration, make sure to verify the following requirements:
- Disk Space - Ensure at least double the current disk space used by the countly.events+HASH collections is available (see the sketch after this list for a way to estimate this).
- Data Cleanup - Delete old or unnecessary data to reduce the migration size.
- Custom Indexes & Data Retention - Record any custom indexes or retention policies applied to drill collections, as these will need to be reapplied post-migration.
- Migration Time - Plan for migration time. Migrating 2 billion aggregated data points can take up to 1 hour.
- Test Setup - Set up a test environment to confirm that the new Countly setup works with your data exports and queries.
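A mongosh sketch for estimating the disk space currently used by the old aggregated event collections (collection names follow the examples in this article; verify against your own database):
mongosh
use countly
var total = 0;
db.getCollectionNames().forEach(function(name) {
    // sum storage of the per-event aggregated collections, skipping meta and merged collections
    if (name.indexOf("events") === 0 && name !== "events" && name !== "events_data") {
        total += db[name].stats().storageSize;
    }
});
print("Old aggregated event collections use ~" + (total / 1024 / 1024).toFixed(1) + " MB");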
Tasks before upgrading
- Create backups for data stored in the database (see the sketch after this list).
- Back up the countly directory if you are not using Docker.
- Stop incoming data.
- Get the latest Countly release, either through the package or the latest Docker image.
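A minimal sketch for a package-based installation; the backup destination and the /opt/countly path are assumptions, and Docker deployments would stop their containers instead:
# stop the Countly server so no new data is recorded
sudo countly stop
# dump the MongoDB databases to a backup directory
mongodump --out=/backup/countly-pre-25.03
# back up the Countly directory
sudo tar -czf /backup/countly-dir-pre-25.03.tar.gz /opt/countly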
Steps to Migrate Aggregated Data to the New Data Model
When you run the Countly upgrade.sh script, it will only migrate the aggregated data.
Granular data does not need to be migrated: the new Countly version will write new data into the new collection but will still read data from the old collections.
Because the data is duplicated, the new collection will take up the same amount of space as all current aggregated event collections combined.
The migration script can also be run later (after the upgrade), and you do not strictly have to stop incoming data, but migration will take 40 times longer for collections that already contain newly recorded data.
If this upgrade script is skipped, then data before the upgrade will not be visible in aggregated data sections for events.
If the script exits with one of the following messages:
Script failed. Exiting. PLEASE RERUN SCRIPT TO MIGRATE ALL DATA.
Script failed. Exiting.
It is advised to share the output with the Countly team to check for issues.
If the final output contains a line like:
"Failed to merge collections: (NUMBER)"
it means that some of the collections were not fully moved, and the output has to be checked for errors.
The script can be run multiple times.
Upon successfully running the upgrade script, verify in the dashboard that old data is visible for aggregated collections in the Events -> All Events section. Afterward, the old aggregated event data collections can be removed by running this script:
/bin/scripts/data-cleanup/remove_old_events_collections.js
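Assuming the script is meant to be run with mongosh and the Countly directory is /opt/countly (both assumptions; check the script's header comments for the intended invocation), this would look like:
mongosh countly /opt/countly/bin/scripts/data-cleanup/remove_old_events_collections.js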
Steps to migrate Drill data to the New Data Model
There are no separate steps necessary. Once you upgrade, data will be written into a new drill collection, but when you query that data, it will be queried from both the new and old collections.
This will increase drill query times by up to 2x since now two collections will need to be queried for each query.
Once older data expires and you no longer need it, you can delete the old drill collections by running:
use countly_drill
db.getCollectionNames().forEach(function(collName) {
    if (collName.startsWith("drill_events") && collName !== "drill_events") {
        db[collName].drop();
    }
});
Then, you can disable querying old collections via "Management -> Settings -> Drill -> Union with data from old collections" to query only the single drill collection and get better drill query times.
If you want to migrate old drill data to a new collection to keep it for a longer period of time, please contact Countly support. However, this kind of migration may take a large amount of time (2 billion data points could take up to 100 hours of migration) and require doubling the disk size.
In case of rollback
Aggregated data recorded while the server was switched to the new version will be lost. However, if you use the Enterprise version, it can be regenerated from granular data.
So, if you upgrade, detect issues, and roll back to an older version, all newly recorded drill and aggregated data will not be visible.
Drill data will still be usable when you upgrade, but aggregated data will need to be recalculated.
Before upgrading back to the new version, clear the countly.events_data collection:
mongosh
use countly
db.events_data.drop()
Then, for all events collections, run an update to unmark previously merged collections:
mongosh
use countly
db.getCollectionNames().forEach(function(collName) {
    if (collName.startsWith("events") && collName !== "events") {
        db[collName].updateMany({}, { "$unset": { "merged": "" } });
    }
});
Differences in how drill data is stored
Old Data Model
In the old data model, each event key had two collections: one for aggregated data and one for granular drill data. Let's take the example of a website and a mobile app with a few events each. The old model would look like the following:
App: My Website
- Event: Clicked button
- Event: Purchase made
App: My App
- Event: Login
Results in
countly.events5acc585c82ec317a7d00d06a39b9453697a3e84b
countly_drill.drill_events5acc585c82ec317a7d00d06a39b9453697a3e84b
countly.events5d1c6e6925889e294cff2b135d7b65d66a741688
countly_drill.drill_events5d1c6e6925889e294cff2b135d7b65d66a741688
countly.events54c2f21f3f8b98e22fc6afe8a3511caed7fc8240
countly_drill.drill_events54c2f21f3f8b98e22fc6afe8a3511caed7fc8240
With this data model, querying across events meant querying all of the collections mentioned above, which increased latency. So, a new data model has been introduced.
New Data Model
In the new data model, using the same example apps and events, only two collections are created for all events.
App: My Website
- Event: Clicked button
- Event: Purchase made
App: My App
- Event: Login
Results in
countly.events_data
countly_drill.drill_events
As you can see above, you now have to query only one collection, which reduces latency and increases throughput.
Granular Document
Old data model document:
{
    "_id": "04d20438e44d212643fc9d0dcb25643e7f050f241626189010_S_1626185010000_1",
    "uid": "S",
    "did": "98039e11-a829-1331-e0bd-15ac946b919d",
    "lsid": "04d20438e44d212643fc9d0dcb25643e7f050f241626189010_S_1626189010000",
    "ts": 1626185010000,
    "cd": { "$date": "2021-07-16T15:38:24.145Z" },
    "d": "2021:7:13",
    "w": "2021:w28",
    "m": "2021:m7",
    "h": "2021:7:13:h17",
    "s": 0,
    "dur": 0,
    "c": 1,
    "up": { ... },
    "custom": { ... },
    "sg": { ... }
}
New data model document, with the added a and e fields:
{
    "_id": "04d20438e44d212643fc9d0dcb25643e7f050f241626189010_S_1626185010000_1",
    "a": "6423f48e7393a0ca3410f42d",
    "e": "Login",
    "uid": "S",
    "did": "98039e11-a829-1331-e0bd-15ac946b919d",
    "lsid": "04d20438e44d212643fc9d0dcb25643e7f050f241626189010_S_1626189010000",
    "ts": 1626185010000,
    "cd": { "$date": "2021-07-16T15:38:24.145Z" },
    "d": "2021:7:13",
    "w": "2021:w28",
    "m": "2021:m7",
    "h": "2021:7:13:h17",
    "s": 0,
    "dur": 0,
    "c": 1,
    "up": { ... },
    "custom": { ... },
    "sg": { ... }
}
Properties of the Data Model
The new properties that were added are mentioned below. To learn more about the other properties, click here.
Property Name | Definition | Example Value
"a" | It defines the application ID. | "6423f48e7393a0ca3410f42d"
"e" | It defines the event key. | "Login"
FAQs
Can we read from old and new collections at the same time?
Yes, the code has been changed so that Countly can read from both the old and new collections simultaneously.
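Conceptually, this union can be pictured as a $unionWith aggregation across the new and old collections. The sketch below is a simplified illustration, not Countly's actual implementation; the hashed collection name is taken from the example earlier in this article:
use countly_drill
db.drill_events.aggregate([
    // new merged collection: filter by event key
    { $match: { e: "Login" } },
    // old per-event collection: every document belongs to this event, so no filter is needed
    { $unionWith: { coll: "drill_events5acc585c82ec317a7d00d06a39b9453697a3e84b" } }
]);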