Why do people seem to feel that they need to get the platform and the technology in place and tested before they ever work on organizing their information and cleaning up the data? They do not look at the data or at what the data needs in order to be used well by the user community. I am repeatedly told that the DTD/schema for the data and the taxonomy implementation have been pushed back because the team is working on implementing the platform. What do they think the platform is for? Without well-organized and well-formed test data, how will they even begin to know that the system will work well for their information? How do they know that the user can find anything in the proposed system if they have no idea what the data is like? Where are the taxonomy terms going to go: in the database, the record, or the system? How are they bound to the information they reference? Building the platform first is like buying a pair of shoes without knowing either the size of the foot or the occasion for which they are to be worn. Rant! Rant!

How would I rather see the implementation? The five steps for a solid platform implementation are:

1.  Users first. What do they use now? You can find that out from what they check out or view in the current system. What kind of data do they need? What are their null queries (searches that return nothing) on the current system? What are they looking for and cannot find? What are they calling their colleagues to find because it is not available on the system? Ask them what they would like to have, and cross-check their answers by looking at what they are doing now.

2.  Data second. What do you have now? What is it like? How is it currently organized? How is it named? What are the data types and formats? How many collections are there? Should they be merged? How clean is the data?

3.  Third, mapping the ideal solution. I know that the “ideal” is not always achievable, but you cannot even begin to get part way to the ideal if you don’t know what the ideal is in the first place. What would the ideal user environment be? What would be the preferred data offering? How do you want to display it, search it, organize it? The taxonomy is important to all three of these factors. So map it out early.

4.  Fourth, specify the ideal. Create the DTD/schema for the ideal data set. For practical purposes, an XML schema has two main parts: 1) the elements (fields) you want included, and 2) the attribute definitions, which constrain how those elements are filled. How will you organize it?

5.  Lastly, assess the road map to the ideal. How flexible should the model be? Is the model extensible? What is the current status? How much of the ideal do you have now? What do you need to add? Does it exist? Who has it? Build it or buy it? What are the semantic enhancements going to be? How will they be accommodated in the platform? Can the platform handle the taxonomy? How big will it be? How will it be applied? Where do the taxonomy terms get placed, in the system or the data record? It is amazing to me how many people plan to implement a taxonomy but have no way to attach a taxonomy term to an information object, article or record on the platform. They have no way to search them, no way to display them. “We knew they were important, planned for them the whole time….” Okay – where do you want them to go?
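The "where do you want them to go?" question can be made concrete with a small sketch. The example below assumes the taxonomy terms live inside the data record as repeatable `subject` elements carrying a term ID. The element and attribute names (`record`, `subject`, `scheme`, `termID`) and the sample records are hypothetical, chosen only for illustration; your own schema will differ. The point is that once a term has a defined home, searching by it, displaying it, and finding untagged records all become simple operations.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: every element name, attribute name, and
# value below is an illustrative assumption, not a prescribed schema.
SAMPLE = """
<collection>
  <record id="r1">
    <title>Hypothetical article on shoe sizing</title>
    <subject scheme="example-taxonomy" termID="T100">Footwear</subject>
    <subject scheme="example-taxonomy" termID="T200">Measurement</subject>
  </record>
  <record id="r2">
    <title>Hypothetical article on data cleanup</title>
    <subject scheme="example-taxonomy" termID="T300">Data quality</subject>
  </record>
  <record id="r3">
    <title>Hypothetical article with no terms attached</title>
  </record>
</collection>
""".strip()


def terms_for(record):
    """Return (term ID, label) pairs attached to one record, for display."""
    return [(s.get("termID"), s.text) for s in record.findall("subject")]


def records_with_term(root, term_id):
    """Return the IDs of records tagged with the given taxonomy term."""
    return [
        r.get("id")
        for r in root.findall("record")
        if any(s.get("termID") == term_id for s in r.findall("subject"))
    ]


def untagged_records(root):
    """Return the IDs of records that carry no taxonomy term at all."""
    return [r.get("id") for r in root.findall("record") if r.find("subject") is None]


root = ET.fromstring(SAMPLE)
print(records_with_term(root, "T100"))
print(untagged_records(root))
```

Storing the term ID alongside the label is one possible design choice: it lets the label change in the taxonomy without breaking the link between the term and the record.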

Clean up the data to match the new XML schema. Plan to extend the schema as the data becomes known. Even the best XML schema in the world will need to be adjusted once it meets the real data. Casting it in stone, as the systems guys want to do, will only cause you heartache. See what the data is really like, build a schema to codify it, and then build the database platform. Building the platform first will mean that you have to shoehorn the data into something that is not quite the same shape as the data. Data bunions quickly result!
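As a rough illustration of letting the real data inform the schema, the sketch below audits a sample of records against a draft list of fields. It reports required fields that are missing (a cleanup job) and fields that appear in the data but not in the schema (candidates for extending it). The field names, the draft schema, and the sample records are all hypothetical.

```python
from collections import Counter

# Hypothetical draft schema: field name -> whether the field is required.
SCHEMA_FIELDS = {"title": True, "author": True, "date": False}

# Hypothetical sample of real records pulled from the current system.
RECORDS = [
    {"title": "A", "author": "X", "date": "2001"},
    {"title": "B", "author": "Y", "sponsor": "Z"},
    {"title": "C", "sponsor": "Q", "date": "1999"},
]


def audit(records, schema_fields):
    """Count missing required fields and fields the schema does not know."""
    missing = Counter()
    unexpected = Counter()
    for rec in records:
        for field, required in schema_fields.items():
            # An absent or empty required field is a data cleanup task.
            if required and not rec.get(field):
                missing[field] += 1
        for field in rec:
            # A field outside the schema is a candidate schema extension.
            if field not in schema_fields:
                unexpected[field] += 1
    return missing, unexpected


missing, unexpected = audit(RECORDS, SCHEMA_FIELDS)
print("Clean up:", dict(missing))
print("Consider adding to schema:", dict(unexpected))
```

Running an audit like this on a sample before the platform is chosen shows which way the gap runs: whether the data needs cleaning to fit the schema, or the schema needs extending to fit the data.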

If you plan early to accommodate the data itself, you avoid most of the delays and headaches of a typical platform implementation.

Save yourself a LOT of money and heartache! Look at the data and how it is going to be semantically enhanced FIRST. Find out what your users need, want, and will actually use. Then do the platform – any other approach is backwards.
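One inexpensive way to start on what users actually look for is to tally the null queries in the current system's search log. The sketch below assumes the log can be exported as pairs of query text and result count; that format and the sample entries are hypothetical.

```python
from collections import Counter

# Hypothetical search log export: (query text, number of results returned).
SEARCH_LOG = [
    ("bunion treatment", 0),
    ("Bunion Treatment ", 0),
    ("shoe sizes", 14),
    ("orthotics", 0),
    ("shoe sizes", 9),
]


def null_queries(log):
    """Return zero-result queries, most frequent first, after light cleanup."""
    counts = Counter(q.strip().lower() for q, hits in log if hits == 0)
    return counts.most_common()


for query, count in null_queries(SEARCH_LOG):
    print(f"{count:3d}  {query}")
```

The most frequent entries on that list are what users want and cannot find, which makes them good candidates for new content or new taxonomy terms.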

Access Innovations can help you assess your data and your user needs, and can build a specification for platform implementation that will work the first time.

Marjorie M.K. Hlava
President, Access Innovations