Data Cleaning and Data Migration

This guidance describes the process of cleaning and migrating data in Primero.


To ensure that data is transferred safely and securely from legacy systems, from alternative systems or from version to version, the Primero team works with in-country partners to design and implement the transfer of existing and future case records by:

  1. Assessing the volume, type and format of the existing case data at an organizational level
  2. Designing and offering appropriate methods of transferring each organization’s data
  3. Ensuring that the quality of the data is high and that records are assigned to the correct team/users

Data cleaning process:

Data should be cleaned before migration. This includes removing duplicates and records that have data quality issues.
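As a rough illustration of this step, the sketch below separates an exported caseload into clean records and records that need attention, either because a required field is blank or because the case ID has already been seen. The field names (case_id, name, date_of_birth) and the choice of required fields are assumptions made for the example, not Primero's actual schema.

```python
# Hypothetical field names; adapt them to the organization's own export.
REQUIRED_FIELDS = ("case_id", "name", "date_of_birth")


def clean_records(records):
    """Split records into (clean, rejected).

    A record is rejected when a required field is missing or blank,
    or when its case_id has already been seen (duplicate).
    """
    seen_ids = set()
    clean, rejected = [], []
    for record in records:
        missing = [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]
        if missing:
            rejected.append((record, "missing: " + ", ".join(missing)))
            continue
        case_id = str(record["case_id"]).strip()
        if case_id in seen_ids:
            rejected.append((record, "duplicate case_id"))
            continue
        seen_ids.add(case_id)
        clean.append(record)
    return clean, rejected
```

Rejected records are returned with a reason rather than deleted, so that a case worker or supervisor can review and correct them before migration.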

Before proceeding with data migration, each organization involved in case management should take the time to go through its current caseload and close cases that are no longer active, following these rules (1):

  • Most often cases are closed when the goals of the child and family, as outlined in the case plan, have been met and there are no additional concerns.
  • The family / child no longer wants support and there are no grounds for going against their wishes (i.e. provided this is safe for the child)
  • The child turns 18 years old (2)
  • The child dies
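Of these criteria, only the age rule can be checked mechanically from the record itself. The sketch below flags open cases where the child is 18 or older on a given date, so that they can be reviewed. The field names (case_id, status, date_of_birth) and the "open" status value are assumptions for the example. The decision to close a case still rests with the case worker and the SoP.

```python
from datetime import date


def age_on(date_of_birth, on_date):
    """Return the age in whole years on the given date."""
    had_birthday = (on_date.month, on_date.day) >= (date_of_birth.month, date_of_birth.day)
    return on_date.year - date_of_birth.year - (0 if had_birthday else 1)


def flag_for_closure_review(cases, on_date):
    """Return the IDs of open cases where the child is 18 or older on on_date."""
    return [
        case["case_id"]
        for case in cases
        if case["status"] == "open"
        and age_on(case["date_of_birth"], on_date) >= 18
    ]
```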

Data cleaning should also be carried out on a regular basis after data migration to maintain data quality.

(1) Case Management Inter Agencies Guidelines, CPWG, 2014. “The specific criteria for when a case can be closed should be identified as part of the SoP.”

(2) Case Management Inter Agencies Guidelines, CPWG, 2014. “When a child turns 18, it is important to prepare for this and support the child to identify what this means and where they can go to for continued support should this be required and/or wanted.”

Data migration process:

The migration options include:

1) Data entry by case workers and/or supervisors:

  • Case workers are responsible for adding their caseloads using the web application
  • Each case worker is assigned a username and password and is given a computer with internet access
  • Each case worker brings paper or electronic records of their caseload to a central location to enter the data into the web application

Pros for this option:

  • This is an excellent opportunity to get case workers familiar with the CPIMS+ web app they will routinely be required to use when it goes live
  • Data does not need to be highly structured; can be from notes or forms
  • Case workers know the records, so the quality of the transferred data should be good
  • Highly confidential, low cost

Cons for this option:

  • Labour intensive; expect one case worker to be able to enter one case every 30 minutes; may take case workers out of the field for 3-5 days
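The time needed can be estimated from the rate above. The sketch below uses the 30 minutes per case stated in this guidance. The number of data entry hours per day (6) is an assumption for the example and should be adjusted to local working conditions.

```python
import math

MINUTES_PER_CASE = 30  # rate stated in the guidance


def data_entry_days(caseload, hours_per_day=6):
    """Estimate whole working days for one case worker to enter a caseload.

    hours_per_day is an assumption for this sketch, not a figure from the guidance.
    """
    total_hours = caseload * MINUTES_PER_CASE / 60
    return math.ceil(total_hours / hours_per_day)
```

Under these assumptions, a caseload of 36 cases takes 3 days and a caseload of 60 cases takes 5 days, which is in line with the 3-5 day range given above.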

2) Data entry by hired data entry staff:

  • UNICEF or other agencies hire staff to enter existing caseloads into the web application
  • Data entry teams are assigned “write-only” roles and are supervised during data entry on provided computers
  • Case workers bring paper or electronic records of their caseload to a central location and hand them to the data entry staff

Pros for this option:

  • This is the fastest way to get “unstructured” data into the CPIMS+
  • Supervisors from each organization oversee the work and learn their oversight role
  • No interruption of services in the field due to case workers doing data entry
  • Acceptable cost

Cons for this option:

  • Less confidential; efforts can be made to ensure privacy, but some exposure remains
  • Lost opportunity for capacity building
  • All cases will need to be reassigned because data entry users aren’t record owners

3) Electronic data migration via scripts:

  • The software developer vendor is contracted to provide services for electronic migration of data
  • Organizations provide clean, structured data to the software developer vendor via a password-protected file (e.g. Excel)
  • The software developer vendor maps the current data structure to the application structure and then migrates a specific and limited set of data points from Excel to the database
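The mapping step can be pictured roughly as follows. This is a minimal sketch, not the vendor's actual migration script: it assumes the spreadsheet has been exported to CSV, and the source column names and target field names in FIELD_MAP are hypothetical. The real mapping is defined against each organization's data and the Primero configuration.

```python
import csv
import io

# Hypothetical mapping: source spreadsheet column -> target field name.
FIELD_MAP = {
    "Case No.": "case_id",
    "Child Name": "name",
    "DOB": "date_of_birth",
    "Sex": "sex",
}


def map_rows(csv_text):
    """Convert exported rows to target-shaped dicts, keeping only mapped columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [col for col in FIELD_MAP if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("source is missing columns: " + ", ".join(missing))
    return [
        {target: (row[source] or "").strip() for source, target in FIELD_MAP.items()}
        for row in reader
    ]
```

Columns outside the mapping, such as free-text notes, are dropped. This is why only a limited set of data points is migrated and why some manual data entry may still be needed afterwards.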

Pros for this option:

  • This is the fastest way to get large volumes of structured, clean data into the CPIMS+
  • This work is covered by the institutional contract with the software development vendor
  • No interruption of services in the field due to case workers doing data entry

Cons for this option:

  • This option is not equally available to all organizations (data must be structured)
  • Cannot capture nuanced data, text fields
  • Less confidential; efforts can be made to ensure privacy, but some exposure remains
  • Lost opportunity for capacity building
  • All cases will need to be reassigned because data migration can’t assign record owners
  • Cost
  • Manual data entry may be required if migration only includes limited fields

If electronic data migration is chosen, then the Primero Developers and System Administrators follow this process to migrate data in Primero:

  1. Data will need to be exported from the old instance. The development team has specific templates and formatting that they follow for this migration.

  2. The system administrator is first put in contact with the Primero administrator. Together they will coordinate sharing the configuration bundles. Once this configuration has been shared, no form changes can be made, although users can still be added and changed in the Primero instance.

  3. The development team will upgrade the configuration.

  4. The development team will then create and run a migration script to migrate all the data from v1.x (or CPIMS) to v1.x

  5. The development team will carry out quality assurance with UNICEF ICTD to ensure that the newly upgraded configuration is working as expected and to validate that the data points have been migrated successfully.

  6. Once the upgraded configuration has passed quality assurance, a clean deploy to the production server will take place. For 1-3 days no users can use the system. During that time, we ask system administrators and the in-country team to verify that the migration is correct by checking the cases and confirming that the data points look the same as before.

  7. This process will take about 4-8 weeks.
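The verification in step 6 can be supported by a simple comparison of records exported before and after the migration. The sketch below is one possible approach, not the procedure used by the development team. It assumes both exports are available as lists of dictionaries keyed by a case_id field, and the field names passed in are hypothetical.

```python
def compare_migration(source_records, migrated_records, fields, key="case_id"):
    """Report cases missing after migration and field values that changed."""
    migrated_by_id = {record[key]: record for record in migrated_records}
    missing, mismatches = [], []
    for record in source_records:
        case_id = record[key]
        migrated = migrated_by_id.get(case_id)
        if migrated is None:
            missing.append(case_id)
            continue
        for field in fields:
            if record.get(field) != migrated.get(field):
                mismatches.append((case_id, field, record.get(field), migrated.get(field)))
    return missing, mismatches
```

An empty result for both lists suggests that the compared data points match. Fields that were not part of the migration still need to be checked by hand.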
