INTRODUCTION
In today’s constantly evolving business and technology landscape, data migration is a pivotal endeavor that empowers organizations to adapt, modernize, and optimize their application landscape and, potentially, their target operating model. Whether driven by system upgrades, business expansions or mergers, regulatory compliance or technological advancements, data migration plays a crucial role in shaping the future of an enterprise.
This journey requires not only a clear understanding of the objectives that drive data migration, but also a meticulous consideration of the requisites that pave the way for a successful integration or merger. From comprehensive planning, clear data governance and quality assurance to risk management, data ownership and stakeholder engagement, each element is a thread in the tapestry of seamless data migration.
This article reflects our insights and experience – gained from working with our financial service clients, executing data migration programs to optimize their operations, or following key acquisitions – and summarizes best practices developed in real life scenarios.
1. THE JOURNEY BEGINS – DATA MIGRATION TYPES AND CHALLENGES
Data migration is typically motivated by diverse business goals, making it a complex endeavor. These goals require different approaches and subject matter experts in areas such as data architecture, environments, privacy, tooling, and governance, to name but a few. Migrating sensitive data requires additional attention and possibly a dedicated strategy, as Know Your Customer (KYC) aspects need to be considered. To ensure a successful data migration, it is crucial to build a common understanding, ease communication and stay on top of the task at hand.
1.1. The rules of the game: data ownership and governance
Ensuring structured policies and accountability is crucial for data migrations. The data owner is responsible for compliance, effective governance and driving implementation of policies and controls. Generally, this entails the following considerations:
To guarantee that data is managed as a critical asset and that possible risks can be recognized and addressed, it is crucial to identify ownership, stewardship, and operational structure. Especially when attempting to merge datasets after an acquisition, it is important to be certain who owns the data in the acquired company and who can give insights into its organization and structure.
1.2. A colorful spectrum: migration types
Depending on the specific type of data migration, different systems will be involved. For instance, a business planning to increase its efficiency by modernizing its systems requires a different kind of migration than a business entering a new market. In Figure 1 we outline the most common systems that are migrated today.
Figure 1: Types of migrations
1.3. More than copying data: challenges and mitigation strategies
The variety of migration types presents different implementation challenges. Below we list the most common challenges, and how they should be addressed.
Identify data streams
Different data elements are usually distributed across separate systems and departments. Similarly, applications and target systems may be spread across different instances. Consequently, these data streams must be mapped and consolidated before they are migrated to a new system that encompasses all functionalities.
This challenge can be mitigated by identifying and creating a Golden Data Sourcing Definition, ideally supported by the existing authoritative data sources (ADS) or systems of record (SoR). Particularly in large businesses, this necessitates a well-defined approach to data profiling so that results can be aligned with all stakeholders. Data cleansing additionally helps in maintaining data quality standards, along with unifying data by Domain Value Mapping (DVM).
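As an illustration of Domain Value Mapping, the following minimal sketch translates source-specific codes into one unified target value. The system names (`core_banking`, `crm`), the country codes and the function names are hypothetical and chosen only for this example; a real program would typically maintain such mappings in a governed reference table rather than in code.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical domain value map: source system -> source value -> unified target value.
COUNTRY_DVM = {
    "core_banking": {"CH": "CHE", "DE": "DEU"},
    "crm": {"Switzerland": "CHE", "Germany": "DEU"},
}


def map_domain_value(source_system: str, value: str) -> Optional[str]:
    """Return the unified target value, or None if no mapping is defined."""
    return COUNTRY_DVM.get(source_system, {}).get(value.strip())


def unify(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into mapped rows and rows that still need a mapping decision."""
    mapped, unmapped = [], []
    for record in records:
        target = map_domain_value(record["source"], record["country"])
        if target is None:
            unmapped.append(record)  # left untouched for review by the data owner
        else:
            mapped.append({**record, "country": target})
    return mapped, unmapped
```

Returning unmapped records separately, instead of dropping or guessing them, keeps open mapping decisions visible to the stakeholders who own the data.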
Lack of data governance
Incomplete or missing standards in respect of the collection, storage, processing, and destruction of data lead to data quality issues, incomplete data procedures and ownership issues, which commonly create challenges for data migrations.
An established data governance plan can prevent such issues by outlining the clear roles and responsibilities of the data migration team, the data quality standards, and the data security procedures.
Data security and compliance
This is one of the biggest data migration challenges that businesses face. During the initiation of the migration process, industry-specific regulations and standards, such as the Payment Card Industry Data Security Standard (PCI DSS), must be understood and compliance must be enforced.
A Migration Audit framework can be a powerful tool for defining all compliance-related requirements that apply to data security and data storage. Furthermore, user data privacy can be protected through the use of data encryption, chain of custody and secure user authentication processes.
Data Sensitivity
Beyond the imperative of safeguarding data privacy, there is a compelling need to address data sensitivity requirements after the transition has occurred. Of particular note is the scenario of migrating customer data across different legal entities in post-merger situations. In these cases, the transfer of customer data may trigger the need to repeat Know Your Customer (KYC) checks and other critical customer onboarding procedures.
A thorough planning phase and migration design, supported by experts on the relevant data streams, are vital to master this challenge. Focusing on sensitive data may require a particular choice of migration strategy.
Lack of data quality and incomplete data
Outdated data sources and/or human error can lead to incomplete or inconsistent data – another very common challenge during a data migration. Establishing high levels of data quality and completeness is crucial before starting the migration process.
Tagging known issues to critical data elements (CDEs) during data sourcing analysis helps to improve data quality in addition to data cleansing and normalization techniques.
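The tagging idea can be sketched as follows. This is a simplified example under stated assumptions: the list of critical data elements, the `missing:<field>` tag format and the function names are hypothetical, and only one kind of issue (missing or empty values) is checked.

```python
from __future__ import annotations

# Hypothetical critical data elements (CDEs) for a client record.
CRITICAL_DATA_ELEMENTS = ("client_id", "name", "date_of_birth")


def tag_cde_issues(record: dict) -> list[str]:
    """Return one issue tag per CDE that is missing or empty in the record."""
    issues = []
    for field in CRITICAL_DATA_ELEMENTS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            issues.append(f"missing:{field}")
    return issues


def profile(records: list[dict]) -> dict[str, int]:
    """Count issue tags across all records to help prioritize cleansing work."""
    counts: dict[str, int] = {}
    for record in records:
        for tag in tag_cde_issues(record):
            counts[tag] = counts.get(tag, 0) + 1
    return counts
```

Aggregating the tags per element gives a simple view of where cleansing effort is needed before the migration starts.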
Data incompatibility (source vs target)
Incompatibility between data formats within old and new systems occurs when migrating between different database management systems – e.g. when both systems use different database structures, file formats, naming conventions, or logical relations.
Mastering this challenge requires choosing the migration platform or vendor that best suits the scenario. Tools like Informatica PowerCenter, Microsoft SQL Server Integration Services (SSIS) and IBM InfoSphere DataStage make it easy to move data between systems for straightforward data migration projects.
Lack of experienced resources
Bank-internal processes and system landscapes are complex and require internal experts, especially when proprietary solutions are employed. However, conflicting priorities limit the availability of these experts, causing resource shortages and consequently delays.
To overcome this challenge, it is crucial that domain experts are hired in the long term or freed from lower-priority tasks in the mid-term. Reserving time with SMEs and decision makers upfront can additionally mitigate risks.
Lack of documentation and information
Time constraints, low prioritization and poor communication with internal/external stakeholders can lead to a lack of documentation and information. This, in turn, may result in an insufficient understanding of the data and the late identification of systems, business units, and interfaces that impact the project scope and timeline.
As a first step, the individuals responsible for the legacy system should be contacted and consulted regarding existing documentation. Next, implementing a powerful documentation tool – such as Confluence – makes the documentation process easier, more traceable and also creates an improved platform for collaboration. It further helps to document, manage and review inputs from stakeholders, migration rules and the data dictionary.
1.4. Migration logistics: mapping sequence and data types
Data migration follows a defined sequence that is governed by the three data types, due to their different functions. Metadata is transferred first, followed by Static Data, and finally Position Data is migrated.
Metadata – Metadata is the descriptive backbone that provides the context, meaning and details of the stored data. As it defines the basic configuration of the system, it is migrated first. Examples include the categorization of data as payments, employee access rights, and organizational structure.
Static data – Static data refers to information in a database that rarely changes and is used in target applications. It is migrated in the second step and typically includes client data such as client name, address, mandates, and accounts.
Position data – Position data refers to information that changes, is highly relevant for business operations, and depends strongly on the data that is migrated and its context. Examples include account balances, pending stock exchange orders, and financial assets or liabilities held on the balance sheet.
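The sequence described above (metadata, then static data, then position data) can be expressed as a simple ordering rule. The sketch below is illustrative only; the `DataType` enum, the `order_batches` function and the batch names are hypothetical.

```python
from __future__ import annotations

from enum import IntEnum


class DataType(IntEnum):
    """Migration order as described above: lower values are migrated first."""
    METADATA = 1
    STATIC = 2
    POSITION = 3


def order_batches(batches: list[tuple[str, DataType]]) -> list[str]:
    """Sort batch names by data type, keeping the input order within each type."""
    # sorted() is stable, so batches of the same type keep their relative order.
    return [name for name, _ in sorted(batches, key=lambda batch: batch[1])]
```

For example, a plan that lists account balances before access rights would be reordered so that the access rights (metadata) are loaded first and the balances (position data) last.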
2. MOVING DAY – BEST PRACTICE APPROACH FOR MIGRATING DATA
2.1. Assembling the pieces: general framework
To accelerate migration and drive program success, it is paramount to analyze the status quo and have a clear understanding of the goals in order to choose the right approach and build a strategy for the migration process. The process can be broken down into three high-level stages.
Data migration strategy
Defining a comprehensive and widely accepted strategy framework is vital for understanding the business value and preferences and monitoring the outcomes to ensure they align with the intended advantages. Additionally, plans for optimization should be devised to guarantee a refined and efficient process.
Building a comprehensive understanding of the current state is paramount and involves initiating a meticulous assessment of the existing data interdependencies, constraints, usage patterns, and other key attributes. This evaluation should extend across various applications, business divisions, and the stakeholders involved.
During value mapping, the gathered insights are combined with the perceived value of different business goals, to establish criteria for prioritization. These criteria provide a clearer direction for the migration efforts. Within this roadmap, the incorporation of opportunities to streamline and restructure processes presents a chance to enhance overall efficiency.
Data migration and integration playbook
Enabling seamless delivery on a large scale requires both the effective handling of technical aspects and the establishment of comprehensive frameworks that are compliant with relevant regulations. Such frameworks span Data Governance, Quality, Financial, and Security Controls.
Adopting DevOps principles for data management is an integral step to fostering streamlined and efficient migration processes. It entails integrating the necessary Continuous Integration/Continuous Deployment (CI/CD) pipelines that facilitate the building, packaging, and deployment of code.
A robust Data Quality Assurance (QA) strategy covering one-time load pipelines and regression testing is vital for guaranteeing the accuracy and completeness of the migrated data. Simultaneously, evaluating the tools and approaches employed for data governance ensures that the chosen methods align with the objectives of the migration.
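One common building block of such a QA strategy is a reconciliation check that compares source and target after a load. The following is a minimal sketch, assuming both data sets fit in memory and share a unique key; the function names and the report structure are hypothetical, not a prescribed framework.

```python
from __future__ import annotations

import hashlib
import json


def row_fingerprint(row: dict, columns: list[str]) -> str:
    """Hash the selected columns of a row in a fixed column order."""
    payload = json.dumps([row.get(col) for col in columns], default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def reconcile(source: list[dict], target: list[dict], key: str, columns: list[str]) -> dict:
    """Compare source and target rows by key and report completeness and accuracy gaps."""
    src = {row[key]: row_fingerprint(row, columns) for row in source}
    tgt = {row[key]: row_fingerprint(row, columns) for row in target}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),     # completeness
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),  # completeness
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),  # accuracy
    }
```

Because the check is deterministic, the same function can be rerun after every test load as a regression test.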
Successful migration heavily depends on deploying an Ingestion Framework for various source types that allows for maximum reusability and configurability with minimal code. A strong emphasis should be placed on consistency, metadata-driven approaches, and the different strategies for data loading (historical/delta/incremental).
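To make the metadata-driven idea more concrete, the sketch below drives the load behavior from a configuration entry instead of source-specific code. It is a simplified illustration: the `SourceConfig` fields, the strategy names and the watermark comparison are assumptions, and only the historical and delta strategies are shown.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SourceConfig:
    """Hypothetical metadata entry describing how one source is loaded."""
    name: str
    load_strategy: str          # "historical" (full load) or "delta"
    watermark_column: str = ""  # column used to detect changed rows in delta loads


def select_rows(config: SourceConfig, rows: list[dict], last_watermark=None) -> list[dict]:
    """Pick the rows to load according to the configured strategy."""
    if config.load_strategy == "historical":
        return list(rows)
    if config.load_strategy == "delta":
        if last_watermark is None:
            return list(rows)  # first run: nothing has been loaded yet
        return [r for r in rows if r[config.watermark_column] > last_watermark]
    raise ValueError(f"unknown load strategy: {config.load_strategy}")
```

Adding a new source then means adding a configuration entry rather than writing a new pipeline, which is where the reusability comes from.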
Iterative data migration and integration
Successful large-scale data migration involves the strategic implementation of proven iterative approaches and frameworks. This encompasses following, adapting, and refining established patterns through iterations, which is crucial for fine-tuning the migration strategy to fit the unique scenario. Additionally, the deployment of specific utilities and thorough testing of data loads contribute to a seamless migration experience.
For scalable delivery, iterative Proof of Concept (POC) phases are essential to ensure a smoother and more efficient migration process. These POCs serve as validation points for the ingestion patterns and lay the groundwork for development of essential code libraries.
Of particular importance is the handling of historical data loads and building robust data pipelines. This shifts the focus to constructing minimally disruptive, automated scripts and Extract, Transform, Load (ETL) processes, in harmony with frameworks and designs.
Operationalization is the final phase of the process, where knowledge transfer and documentation activities are executed in line with the established operating model. This phase ensures that the new system operates effectively and can be managed by the relevant teams. Simultaneously, the old system is phased out during the decommissioning phase.
2.2. Trickle or Big Bang: implementation approaches
Different approaches to implementation will address the various needs, scenarios, or challenges encountered during the process of data migration. Similarly, factors such as the nature of the data, the source and target systems, business requirements and technical constraints will dictate the optimal approach. The following implementation dimensions help with choosing the appropriate approach (see Figure 2):
Figure 2: Different dimensions of implementation approaches
The choice between these approaches should be based on a thorough understanding of the specifics of the individual data migration project. It is important to assess the trade-offs, risks, and benefits associated with each approach and select the one that best aligns with the organization’s requirements.
Some migration campaigns explicitly aim at improving the data quality, including deduplication, cleansing, and validation. As a subset of every data migration project, this topic is touched upon in one of the migration phases outlined below (see Figure 3).
2.3. Paving the road: migration phases
Despite all the differences between the approaches, every migration can be broken down into the following agile phases:
Figure 3: Agile phases of a migration
3. CONCLUSION
Data migration requires a carefully planned, strategically executed, and well-documented approach. Complying with these principles enables businesses to mitigate risks, maximize business continuity and master the multi-faceted nature of data migration projects.
The inherent complexity of such projects, however, means there are additional opportunities for improving business KPIs, processes and operational agility, with areas such as data quality, security and scalability providing mechanisms for further future-proofing businesses.
The close relationship between data migration and data governance must be emphasized once more. Successful migration hinges upon a solid data governance framework that ensures data integrity, compliance, and traceability throughout the migration journey.
By embracing the insights shared in this white paper, stakeholders can navigate the complexities of data migration with confidence, ensuring that data remains a driving force in their businesses.
Capco specializes in providing comprehensive assistance for the development and orchestration of end-to-end data migration frameworks, including their execution, and takes responsibility and accountability for completeness, deadlines and quality. Drawing upon our extensive experience, we have built in-house accelerators through years of involvement in numerous post-merger integration (PMI) and data migration initiatives. Our commitment to taking ownership and ensuring accountability is unwavering, as we diligently adhere to our "jobs to be done" approach.
Contact us to discuss your data migration vision and challenges.