March 25, 2010 09:29 AM

Could Unstructured Data Management Technology Replace the Relational Database Someday?

SQL Server Pro
InstantDoc ID #104665

I’m going to say something that will make many of you mad and wonder if I’m a complete idiot: maybe someday we won’t need relational databases. Let me be clear about a few things before going further. First, I don’t think we’re anywhere close to saying “Thanks, Mr. Relational Database, but we don’t need you anymore.” Second, I’ve known the basic rules of normalization longer than some of you have been alive. Trust me, I get databases. But maybe, just maybe, technology is reaching the point at which some data management paradigms that used to be shoehorned into a relational database no longer need to be. Here’s what got me thinking these crazy and heretical thoughts.

I joined the iPhone nation this past November. I don’t want to say it’s changed my life, but honestly, having a useful browser in your pocket and all of the apps is pretty darn cool. Evernote is now one of my favorite apps. It’s sort of like Microsoft OneNote except 1,000 percent better. Evernote belongs to a newish class of applications that basically stick everything in a bucket and let you tag your entries. Calling my data in Evernote “unstructured” is actually being generous. It’s antistructure, more akin to having a big filing cabinet and simply throwing random pieces of paper in it. But the indexing and tagging actually work, and they work well and fast. I keep all of my receipts in it. I keep business cards from conferences in it. I keep meeting notes in it. More and more, I’m keeping an awful lot of stuff in it. And frankly, the simple tagging model works amazingly well. Evernote can even do text recognition on pictures and PDF files. It’s amazing how quickly and easily I can find the stuff I’m looking for. And did I mention that my Evernote data is in the cloud, so it’s magically always there regardless of which device I happen to be using?
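
For the curious, here is a minimal sketch of how a tag-based bucket can work under the hood. It is a toy Python illustration and not a description of Evernote's actual design: the `TagBucket` class, its method names, and the sample notes are all hypothetical. The idea is simply an inverted index that maps each tag to the set of notes carrying it, so a search becomes a set intersection.

```python
from collections import defaultdict


class TagBucket:
    """A toy 'bucket' of notes searchable by tag (hypothetical, for illustration only)."""

    def __init__(self):
        self.notes = {}                 # note_id -> note text
        self.by_tag = defaultdict(set)  # tag -> set of note_ids (the inverted index)
        self._next_id = 1

    def add(self, text, tags):
        """Throw a note in the bucket and index it under each of its tags."""
        note_id = self._next_id
        self._next_id += 1
        self.notes[note_id] = text
        for tag in tags:
            self.by_tag[tag.lower()].add(note_id)
        return note_id

    def find(self, *tags):
        """Return the texts of notes that carry every given tag, in insertion order."""
        if not tags:
            return []
        # .get avoids creating empty index entries for tags nobody has used.
        id_sets = [self.by_tag.get(tag.lower(), set()) for tag in tags]
        matching = set.intersection(*id_sets)
        return [self.notes[note_id] for note_id in sorted(matching)]


bucket = TagBucket()
bucket.add("Hotel invoice, conference trip", ["receipt", "travel"])
bucket.add("Business card from a conference contact", ["contact", "conference"])
bucket.add("Taxi fare to the airport", ["receipt", "travel"])
bucket.add("Weekly status meeting notes", ["meeting"])

# Everything tagged both "receipt" and "travel"
print(bucket.find("receipt", "travel"))
```

Notice that there is no schema anywhere: a note is just text plus whatever tags you feel like attaching, and lookups stay fast because each tag already knows which notes it points to.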

I don’t expect that a big telephone company is going to put its call records in a tag-based bit bucket in the cloud any time soon. But I can easily see certain classes of corporate data management needs being well addressed by these approaches.

And have you ever heard of a company called Google? Sometimes it scares me how smart it is at guessing the full phrase I’m going to type in a search box. Indexing? Fast access to data? Yeah, Google (and Bing) are pretty good there. What if you could Google “Show me the 3rd quarter sales numbers for the magenta widgets” and the search engine knew exactly what you were talking about through the clever contextual knowledge it has about you?

I don’t think bit buckets will ever replace “real” databases. Locking, concurrency, high availability, one version of the truth, and a few semesters’ worth of other database topics make me believe we will always need databases. Will databases always be relational? Does it matter? I suspect many problems in the database world will always be best described using relational math and set theory. However, many database pros are at risk of having their heads in the sand when it comes to some of the newer trends involving unstructured data, including, but not limited to, SharePoint. It’s time to pay attention. Your users and customers are.




Comments
  • Charles Phillips
    Apr 05, 2010

    What you've described has been around since the early 1990s in Lotus Notes. Instead of "records," Notes has "documents," and you can stuff anything into them because they are not defined by columns. If I want to add a new field to just one document, I can do it. If I want to attach files to a document, it's easy. It's also quite fun because it gives RDBMS guys conniption fits; they just don't get it.

    Notes does a terrible job simulating structured data, but if structure isn't required, then for better or worse, it gives you a lot of freedom.

  • BARKER
    Apr 05, 2010

    Great article. I've been thinking about this same topic recently. I'm an MCTS in SQL Server/BI, and I too am looking at cloud technology and what it means for me. More specifically, I'm looking at using SQL Azure vs. Windows Table storage. I see that as moving from structured to unstructured, but I'm at the novice stage with the cloud, so I'm still learning. One area I'm trying to learn more about is using the Entity/Properties model and getting the data I need into the same Partition in the cloud, etc. It does look like I need to embrace the Borg.

  • John D. Lambert
    Mar 26, 2010

    While user/DBA involvement may change, relational data management will never go away because it's based on the mathematical reality that some data elements are related to other data elements in one-to-one, one-to-many, or many-to-many relationships. Advanced management of erroneously named "unstructured" data (which is actually highly structured) is primarily based on deducing and extracting metadata that is managed in a relational manner. Data management engines may change radically, and artificially intelligent data systems may eventually make database administrators obsolete, but even then, most data with simple relationships will still be managed with relational rules.

  • Kohlmiller
    Mar 25, 2010

    I think of it this way: in a hospital we need the relational data structure for most of the data. But the strategic analysis is at some point unstructured. Basically, you have a lot of reports, and it is the results of those reports (with some parameterization) that are the unstructured data. What are the top 10 meds for AMI patients? Compare that data for last quarter versus a year ago. What's the difference in costs? In length of stay? In mortality? Do any physicians stand out? Are there any other big changes in how we treat these patients?
    Now take the beginning of this list of questions - what data structure answers all of them the best? A cube? An RDBMS? Or something that has a different structure OR is largely unstructured yet ready for these kinds of questions?

  • Geoff Ambler
    Mar 25, 2010

    Couldn't agree more. Increasingly, corporate data is being created outside of ledgers and other similar structured formats, and in volumes that challenge traditional architectures. I can't see RDBMSs being mothballed any time soon because of the rigour of their underlying model, but their slice of the data management pie will certainly decrease.
