Got Legacy?

Entries tagged as ‘data migration’

How to Win Friends and Migrate Data

August 26, 2008

Migration is not just for the birds

Data migration is like everything else – there is a right way and a wrong way to do it. Actually there is a right way for every scenario, and many, many wrong ways! This article will help you consider the alternatives.

What is data migration?

Just to make sure we’re comparing apples to apples, we had better define our terms. At times like this, I turn to Wikipedia:

Data migration is the process of transferring data between storage types, formats, or computer systems. Data migration is usually performed programmatically to achieve an automated migration, freeing up human resources from tedious tasks. It is required when organizations or individuals change computer systems or upgrade to new systems, or when systems merge (such as when the organizations that use them undergo a merger/takeover).

Can you imagine doing this by hand?? Nowhere in the world are there workers desperate enough to take on that task uncoerced. Hence the many automated tools on the market – computers will do anything.

What to look for

When you’re looking for a solution to your data migration problem, make sure that you consider a few key features that can make the difference between success and disaster.

1. Enables schema changes

At the end of the day, you want your mission-critical data organized in a rational way that makes it easy to get the data out in a form that serves your business. Trouble is, most legacy data stores are anything but rational. They have grown organically over the past 20 years like English Ivy next to a leaky nuclear reactor – not in an organized way, but with a certain craziness. They tend to exhibit “archeological layering”, so you can see where a certain field switched from use A to use B without actually changing names. The bottom line is that if you have a chance to fix your underlying data model, you should do it. It will make your life much easier in the future. Look for a toolset that allows a flexible and powerful assortment of transformations, including table splitting and merging, rationalization of foreign key relationships, extraction of value lists, etc.
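To make two of those transformations concrete, here is a minimal sketch in Python of extracting a value list and splitting one wide legacy table into two. The table layout, field names and helper functions are all made up for illustration; they do not describe any particular tool.

```python
# Hypothetical legacy rows: one wide table mixing customer and order data.
legacy_rows = [
    {"cust_no": 1, "cust_name": "Acme", "status": "ACTIVE", "order_no": 10, "amount": 250},
    {"cust_no": 1, "cust_name": "Acme", "status": "ACTIVE", "order_no": 11, "amount": 75},
    {"cust_no": 2, "cust_name": "Birch", "status": "CLOSED", "order_no": 12, "amount": 40},
]


def extract_value_list(rows, field):
    """Map each distinct value of a field to a surrogate id (a lookup table)."""
    values = sorted({row[field] for row in rows})
    return {value: index for index, value in enumerate(values, start=1)}


def split_table(rows, status_ids):
    """Split the wide table into a customers table and an orders table."""
    customers = {}
    orders = []
    for row in rows:
        # One customer row per cust_no; the status text becomes a foreign key.
        customers[row["cust_no"]] = {
            "cust_no": row["cust_no"],
            "cust_name": row["cust_name"],
            "status_id": status_ids[row["status"]],
        }
        # Every legacy row yields one order that points back at its customer.
        orders.append(
            {"order_no": row["order_no"], "cust_no": row["cust_no"], "amount": row["amount"]}
        )
    return list(customers.values()), orders


status_ids = extract_value_list(legacy_rows, "status")
customers, orders = split_table(legacy_rows, status_ids)
```

A real legacy store will have conflicting duplicates and overloaded fields that this sketch ignores, which is exactly why the transformation toolset needs to be flexible.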

2. Declarative

I hesitate to trot out the comp-sci jargon, but you want a data migration tool that stores your transforms as a set of rules, not as a script. Writing a bunch of scripts is the old-fashioned way to get the job done, and those scripts can get pretty huge and ugly. After all, it’s a programming task in itself, and not a pretty one. Do yourself a big favor and use a specialized tool that has some understanding of the fundamentals of the problem and stores your migration rules as rules, not scripting statements.
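As a rough illustration of "rules, not scripts", the sketch below keeps the migration rules as plain data and runs them through one small, generic engine. The field names, rule format and transform names are assumptions invented for this example, not the format of any real product.

```python
from decimal import Decimal

# Hypothetical rules: each one is data, not code. It names a target field,
# the legacy source field, and a transform to apply.
RULES = [
    {"target": "customer_name", "source": "CUSTNM", "transform": "strip"},
    {"target": "balance_cents", "source": "BAL", "transform": "dollars_to_cents"},
    {"target": "is_active", "source": "STAT", "transform": "flag_a"},
]

# The small library of named transforms the rules may refer to.
TRANSFORMS = {
    "strip": lambda value: value.strip(),
    "dollars_to_cents": lambda value: int(Decimal(value) * 100),
    "flag_a": lambda value: value == "A",
}


def apply_rules(record, rules):
    """Build one target record by applying every rule to a legacy record."""
    return {
        rule["target"]: TRANSFORMS[rule["transform"]](record[rule["source"]])
        for rule in rules
    }


migrated = apply_rules({"CUSTNM": "  Acme Ltd ", "BAL": "12.50", "STAT": "A"}, RULES)
```

Because the rules are data, they can be listed, compared between runs and checked against the schema, none of which is easy with a pile of hand-written scripts.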

3. Iterative

Admit it – you will not get this right the first time. Trying to get it right the first time would be a huge waste of time, and would probably lead you down the garden path to “analysis paralysis.” You need to get something done, see how it worked, learn from it, make changes and do it again. Lather, rinse, repeat. You want a toolset that lets you run your data migration, inspect the results, and refine your migration rules iteratively.

4. Auditable

In some environments this is not a nice-to-have, it’s the law. We can’t go losing our patient drug allergy records or our policy line items or (God forbid) our deposit and withdrawal records! And even in less regulated niches, it’s a good idea to be able to show where every record came from, and where every record went.
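One simple way to picture auditability is a migration loop that writes an audit entry for every source record, whether it was migrated or rejected, followed by a reconciliation check. This is only a sketch under assumed record shapes; `migrate_with_audit`, `reconcile` and the sample transform are hypothetical names.

```python
def migrate_with_audit(source_rows, transform):
    """Migrate rows and record where every source record went."""
    targets, audit = [], []
    for row in source_rows:
        try:
            target = transform(row)
        except (KeyError, ValueError) as exc:
            # Rejected records are logged too, never silently dropped.
            audit.append(
                {"source_id": row["id"], "target_id": None, "outcome": f"rejected: {exc!r}"}
            )
            continue
        targets.append(target)
        audit.append({"source_id": row["id"], "target_id": target["id"], "outcome": "migrated"})
    return targets, audit


def reconcile(source_rows, audit):
    """Return the ids of source records that have no audit entry at all."""
    accounted_for = {entry["source_id"] for entry in audit}
    return [row["id"] for row in source_rows if row["id"] not in accounted_for]


def to_new_deposit(row):
    """Hypothetical transform: new-style id plus a numeric amount."""
    return {"id": f"N{row['id']}", "amount": int(row["amount"])}


source = [
    {"id": 1, "amount": "100"},
    {"id": 2, "amount": "abc"},  # bad data: will be rejected and logged
    {"id": 3, "amount": "7"},
]
targets, audit = migrate_with_audit(source, to_new_deposit)
unaccounted = reconcile(source, audit)
```

An empty `unaccounted` list means every source record is either in the target or has a logged reason why it is not.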

5. High performance

This is probably obvious, but when you’re cutting over from the old environment to the new, you will not have weeks and weeks to do it. You’ll be lucky to get 24 hours. And you may have hundreds of millions of records to read, cleanse, transform, re-insert, validate and audit in that all-too-brief period of time. Your tool had better be up to the task!

Integration: the killer feature while modernizing

It’s hard to overstate how useful a good data migration toolset is while you’re modernizing a large legacy system, especially if it’s integrated properly with your modernization toolset.

The first huge benefit is test data. While you’re developing, it makes a world of difference to be able to try out your new screens, batches and reports with actual (not just realistic, but actual) data. So the earlier it’s available in the life cycle, the better. Woe betide those who leave data migration to the end of the project, for they have missed a big opportunity.

Integration is a two-way street as well: it’s extremely valuable to feed the inevitable screen and interface changes back into the modernized schema and the transformation rules, so that you can keep your test data current with limited fuss.

Yes we built one

I’m sure you will not be shocked to discover that MAKE has just such a tool, designed from the ground up with data migration in mind and integrated with the rest of our modernization suite. But don’t take my word for it – use your newfound knowledge and perspective to evaluate all the data migration tools on the market before making any kind of decision.

- Tom Metzger

Categories: data migration

Hidden Gotchas: Batches and Reports in a legacy modernization project

May 23, 2008

Introduction: A common challenge when modernizing an old legacy application that is heavily rooted in the business is envisioning how the disparate parts of the new system will interact to provide a business solution similar to the one the legacy system provided.

Examples: Let’s review a couple of examples of pitfalls that could become major obstacles in the way of successful delivery:

Challenge 1: Hey, where did my extract go?

Some legacy Batches (programs that are scheduled to run unattended by users) extract newly entered records according to certain criteria to a flat-file. This process is followed by another batch that reads the extracted records, updates them according to certain logic, then populates some other table in the system with them (see figure 1).

Figure 1

A Modernization Specialist (a Developer in the role of System Analyst) modernizing such a set of Batches can reasonably design a modernized System-Event (a program that may have no interface and is usually scheduled to run unattended) that simply skips the extract part and runs a script to insert the records into the destination table with the required updates (see figure 2). Meanwhile, the System Architect (in charge of making core, high-level system design decisions), unaware of the history behind the file generation, might have agreed to keep sending the flat-file extract as is to a 3rd party that expects to be fed with it.

Once the modernized System-Event is built, no flat file will be generated, which means that the 3rd party will not receive its file. This kind of problem is hard to detect, because all parties have assumed that the existing extract will not change and therefore do little follow-up.

Prevention: Clear communication between System Analysts during the design phase minimizes the chance of such pitfalls. The System Architect can lead the System Analyst group in one or more sessions to check for missing pieces.
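Those design sessions can be backed by a simple inventory of what each legacy Batch produces and who consumes it. The sketch below, with made-up file, table and consumer names, flags any legacy output that the modernized design no longer produces but that still has a consumer.

```python
# Hypothetical inventory: legacy output -> everyone who consumes it.
legacy_outputs = {
    "new_records_extract.dat": ["update batch", "3rd-party feed"],
    "destination_table": ["online screens"],
}

# Hypothetical design: outputs the modernized System-Event still produces.
modernized_outputs = {"destination_table"}

# Consumers that disappear along with the legacy Batches.
retired_consumers = {"update batch"}


def orphaned_consumers(legacy_outputs, modernized_outputs, retired_consumers):
    """Find consumers of legacy outputs that the new design no longer feeds."""
    orphans = {}
    for output, consumers in legacy_outputs.items():
        if output in modernized_outputs:
            continue  # still produced, nothing to worry about
        remaining = [name for name in consumers if name not in retired_consumers]
        if remaining:
            orphans[output] = remaining
    return orphans


orphans = orphaned_consumers(legacy_outputs, modernized_outputs, retired_consumers)
```

Here the check reports that the 3rd-party feed still depends on the extract, which is the missing piece from this example. The inventory is only as good as the information the analysts and the architect put into it.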

Challenge 2: Report numbers look all funny!

While most legacy reports are produced by the program’s own code, the reporting technology used in a modernized application is very likely to incorporate a reporting server (Cognos, Crystal Reports, etc.). This presents the challenge of synchronizing the modernized application’s program calls with the related modernized reports.

Consider a legacy Batch made of 5 legacy programs, each producing 1 legacy report, that is modernized into only 2 modernized programs. If we don’t keep track of which legacy report is executed when, and ensure that we are calling the modernized reports from the equivalent stages of modernized System-Event processing, the report numbers will not match.

Prevention: We have to start by properly documenting the legacy reports and where they are called from. Next, we document the modernized reports and their parameters (we may combine a few reports together, ignore some reports and generate new ones). Finally, we link the modernized report definitions to the legacy ones, so that the developers modernizing the Batch code have enough information to make the right report calls at the right time.
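The linkage between legacy and modernized reports can be as simple as a lookup table that developers can query. The following sketch uses invented report, program and stage names for the 5-programs-into-2 scenario. It shows one possible shape for that documentation, not a prescribed format.

```python
# Hypothetical inventory: legacy report -> legacy program that produces it.
LEGACY_REPORTS = {
    "RPT1": "PGM1",
    "RPT2": "PGM2",
    "RPT3": "PGM3",
    "RPT4": "PGM4",
    "RPT5": "PGM5",
}

# Hypothetical linkage: legacy report -> modernized report and the
# System-Event stage after which it must be called.
# None means the report was deliberately dropped.
REPORT_MAP = {
    "RPT1": {"report": "intake_summary", "after_stage": "event_1.load"},
    "RPT2": {"report": "intake_summary", "after_stage": "event_1.load"},
    "RPT3": {"report": "posting_totals", "after_stage": "event_2.post"},
    "RPT4": None,
}


def unmapped_reports(legacy_reports, report_map):
    """Legacy reports nobody has made a decision about yet."""
    return sorted(name for name in legacy_reports if name not in report_map)


def reports_after_stage(report_map, stage):
    """Modernized reports that must be called once a stage completes."""
    return sorted(
        {
            entry["report"]
            for entry in report_map.values()
            if entry is not None and entry["after_stage"] == stage
        }
    )


missing = unmapped_reports(LEGACY_REPORTS, REPORT_MAP)
due = reports_after_stage(REPORT_MAP, "event_1.load")
```

In this sample, `missing` flags RPT5 as a report that still needs a decision, and `due` tells the developer which report calls belong right after the first stage.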

Conclusion: A legacy modernization project can be a huge undertaking. Paying attention to details and making the right modernization decisions is not enough to ensure successful delivery. You may modernize every part of an application, but the new pieces may still not work together to deliver the same solution the legacy application delivered, hence delivery failure!

The modernization methodology should be comprehensive enough to ensure that the big picture is not blurred by all the details.

- Sinan Ali

Categories: Business