There is an interesting article at regdeveloper.com titled
"Structured data is boring and useless". It offers insight into a serious point of confusion: what exactly distinguishes structured from unstructured data. Here is a key excerpt:
"We all know that structured data is boring and
useless; while unstructured data is sexy and chock full of value.
Well, only up to a point, Lord Copper. Genuinely unstructured data
can be a real nuisance - imagine extracting the return address from
an unstructured letter, without letterhead and any of the
formatting usually applied to letters. A letter may be thought of
as unstructured data, but most business letters are, in fact,
highly-structured." ....
Duncan Pauly, founder and chief technology officer of Coppereye, adds eloquent insight to the conversation:
"The labels "structured data" and "unstructured
data" are often used ambiguously by different interest groups; and
often used lazily to cover multiple distinct aspects of the issue.
In reality, there are at least three orthogonal aspects to
structure:
* The structure of the data itself.
* The structure of the container that hosts the data.
* The structure of the access method used to access the data.
These three dimensions are largely independent and one does not
need to imply another. For example, it is absolutely feasible and
reasonable to store unstructured data in a structured database
container and access it by unstructured search
mechanisms."
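To make Pauly's closing example concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table name, the sample letters, and the two helper functions are my own assumptions for illustration and do not come from the article. The container is structured (a relational table), the data is unstructured (free text), and the access method can be either one.

```python
import sqlite3

# Structured container: a relational table with a fixed schema.
# (Table name and sample rows are hypothetical.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE letters (id INTEGER PRIMARY KEY, body TEXT)")

# Unstructured data: free-form text with no internal markup.
letters = [
    "Dear Sir, please send replies to 12 High Street, Bath. Yours, A. Smith",
    "Thanks for lunch last week. Let's talk about the indexing project soon.",
]
conn.executemany("INSERT INTO letters (body) VALUES (?)", [(t,) for t in letters])


def keyword_search(term):
    """Unstructured access: substring match over the free text."""
    rows = conn.execute(
        "SELECT id FROM letters WHERE body LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


def fetch_by_id(letter_id):
    """Structured access: lookup by primary key."""
    row = conn.execute(
        "SELECT body FROM letters WHERE id = ?", (letter_id,)
    ).fetchone()
    return row[0] if row else None
```

The same rows answer both kinds of request, which is the point of treating the three dimensions as independent: neither the schema nor the text had to change to switch access methods.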
Understanding and appreciation of data is dwindling at a time when
the reverse should be happening. We are supposed to be in the
throes of the "Information Age", but for some reason this appears
to have no correlation with data and "data access" in the minds of
many, as reflected in the broad, contradictory positions taken on
unstructured vs. structured data: structured is boring and
useless, while unstructured is useful and sexy.
The difference between "Structured Containers" and "Structured
Data" is clearly misunderstood by most (an unfortunate fact).
For instance, all DBMS products are "Structured Containers"
aligned to one or more data models (typically one). These products
have been limited by proprietary data access APIs and underlying
data model specificity when used in the "Open-world" model that is
at the core of the World Wide Web. This confusion also carries over
to the misconception that Web 2.0 and the Semantic/Data Web are
mutually exclusive.
But things are changing fast, and the concept of multi-model
DBMS products is beginning to crystallize. For our part, we have
finally released the long-promised "OpenLink Data Spaces"
application layer, which has been developed using our Virtuoso Universal
Server. We have structured, unified storage containment exposed
to the data web cloud via endpoints for querying or accessing data
using a variety of mechanisms, including GData, OpenSearch,
SPARQL, XQuery/XPath, SQL, etc.
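As a rough illustration of what querying such an endpoint can look like, the sketch below builds a SPARQL Protocol GET request URL in Python. The endpoint address is a placeholder assumption and not a real OpenLink Data Spaces URL, and the helper name is hypothetical. The only protocol detail relied on is that a SPARQL query sent via GET travels in the `query` parameter.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL, used only for illustration.
ENDPOINT = "http://example.org/sparql"


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL with the query text URL-encoded."""
    return endpoint + "?" + urlencode({"query": query})


# A generic query that asks for any five triples.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
url = sparql_get_url(ENDPOINT, QUERY)
```

The resulting URL could be fetched with any HTTP client; nothing is sent over the network here, since the real endpoint address is not given in this post.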
To be continued....