“We see a clear opportunity to create value through better management and use of unstructured content,” said Viral Gandhi, CIO, Cox & Kings (CIOL news).
What is he talking about? And why is it important?
The value of structured data is well known. ERP systems love that data! What would we do without invoices, purchase orders, bills of materials, and asset management data? Or take CRM systems: they love structured information about our customers, business partners and marketing opportunities.
What is unstructured data?
Everybody produces vast amounts of unstructured information. Let’s just look at email. It ranges from ‘for your information’, ‘for your action’, ‘feedback’ or ‘asking for help’, to communication, delegation, and amusement. Email is structured in terms of sender, recipient, subject, and body, but not in terms of content. The same applies to the documents we produce, be they reports, spreadsheets, presentations or media files.
All this information has value. Most of the high-value content is hidden among the vast amount of transient information, and it is often recognised only by people “in the know”. That is, they know where the information is likely to be found and what hints to look for. Most of us don’t have the time or inclination to look for information this way.
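To make the split concrete, here is a minimal sketch using Python's standard `email` module. The message, addresses and wording are made up for illustration. The headers come out as clean, structured fields, while the body is just free text whose meaning a program cannot easily read.

```python
from email import message_from_string

# A made-up message, for illustration only.
raw = """\
From: alice@example.com
To: bob@example.com
Subject: FYI: supplier contract

Bob, the supplier agreed to extend the contract by two years.
Can you update the forecast before Friday?
"""

msg = message_from_string(raw)

# Structured part: well-defined fields that any system can sort and filter on.
structured = {
    "sender": msg["From"],
    "recipient": msg["To"],
    "subject": msg["Subject"],
}

# Unstructured part: free text. The valuable facts (a contract extension,
# a request with a deadline) are in here, but nothing marks them as such.
body = msg.get_payload()

print(structured)
print(body)
```

Nothing in the structured fields says that this email contains a contract decision and an action item. That knowledge sits in the body, visible only to someone who reads it.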
Making value visible
The obvious first answer is metadata! It has been proposed and tested for many years. The fact that we are still talking about it means it didn’t work. Previously I mentioned the 7-second test, and I’m pretty sure others hit the same or a similar stumbling block. The trick is to
- concentrate on the essential attributes (3 is a good number to aim for)
- provide preselected options that ensure the value items are identified and have consistent spelling
- avoid bothering the user with automatically filled-in fields (show those once and let the user confirm)
- keep the options mutually exclusive and well understood (the link shows the 50 most common symbols around the world)
- make those few mandatory
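
The points above could be sketched in code roughly as follows. This is only an illustration under my own assumptions: the three attribute names, their options, and the `tag_document` helper are hypothetical and not taken from any particular product.

```python
# Hypothetical attributes and options, chosen only for illustration.
# Three essential attributes, each with a fixed list of mutually
# exclusive choices so that spelling stays consistent.
OPTIONS = {
    "doc_type": ("contract", "report", "correspondence"),
    "business_value": ("transient", "reference", "record"),
    "audience": ("internal", "partner", "public"),
}


def tag_document(choices, auto_fields):
    """Check the user's choices and merge them with auto-filled fields.

    choices     -- what the user picked, e.g. {"doc_type": "report", ...}
    auto_fields -- values the system filled in by itself (author, date, ...),
                   which the user only confirms and never types
    """
    errors = []
    for name, allowed in OPTIONS.items():
        value = choices.get(name)
        if value is None:
            # Every one of the few attributes is mandatory.
            errors.append(f"{name} is mandatory")
        elif value not in allowed:
            # Only preselected options are accepted, never free text.
            errors.append(f"{name} must be one of {allowed}")
    if errors:
        raise ValueError("; ".join(errors))
    # Automatically filled fields come along without extra user effort.
    return {**auto_fields, **{name: choices[name] for name in OPTIONS}}


tags = tag_document(
    {"doc_type": "report", "business_value": "record", "audience": "internal"},
    {"author": "jdoe"},
)
print(tags)
```

With only three picks from short lists, tagging stays quick for the user, and a missing or misspelled value is rejected straight away instead of polluting the metadata.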
There are many more tricks and good practices for identifying information value. The next step is to find that information again. I will leave that for another time.