Seth Grimes talks about the claim that "80% of business-related information resides in unstructured form, primarily text." I remember this being an important element of discussions of information management (and, by extension, knowledge management) as I was getting into the topic.
BridgePoint Experts' Corner: Unstructured Data and the 80 Percent Rule.
[snip] It does seem obvious that a very high proportion of data is unstructured: How much of your workday is spent reading or writing e-mails, reports, or articles and the like, in conversations, or listening to live or recorded audio? And in making the case for tapping unstructured sources, a very important asset in fields ranging from customer experience management to counter-terrorism, it’s helpful to be able to quantify the proportion, to put a number on it.
I like that he's taken the time to explore the source of this claim, which appears to be more-or-less correct. But even more important: why is it interesting?
As I read through Seth's discovery, what struck me is that at some point ALL data is "unstructured," because we can only add structure to it when we build a narrative around it. Of course, I know that databases are "structured" in that I can tell a phone number from a fax number, if the fields are labeled properly. But the connections that I draw from that data can really only be expressed in words and language. And that is the point at which I add my own structure to the data, even as I unstructure it from the formal rows and columns of a data cube.
[Yes, I know I am playing fast and loose with information architecture concepts. Please forgive.]