October 28, 2010 11:00 AM

RDBMS vs. NoSQL: And the Winner Is . . .

SQL Server Pro
InstantDoc ID #128903

In April, I wrote a column called “Could Unstructured Data Management Technology Replace the Relational Database Someday?” I’ve included my opening and closing remarks as a quick reminder and level set for this month’s commentary.

I opened “Could Unstructured Data Management Technology Replace the Relational Database Someday?” with “I’m going to say something that will make many of you mad and wonder if I’m a complete idiot—maybe someday we won’t need relational databases. Let me be clear about a few things before going further. First, I don’t think we’re anywhere close to saying “Thanks Mr. Relational Database, but we don’t need you anymore.” Second, I’ve known the basic rules of normalization longer than some of you have been alive. Trust me, I get databases. But maybe, just maybe, technology is reaching the point in which some data management paradigms that used to be shoehorned into a relational database no longer need to be. Here’s what got me thinking these crazy and heretical thoughts.”

I closed the commentary with “I don’t think bit buckets will ever replace ‘real’ databases. Locking, concurrency, high availability, one version of the truth, and a few semesters’ worth of other database topics make me believe we will always need databases. Will databases always be relational? Does it matter? I suspect many problems in the database world will always be best described using relational math and set theory. However, many database pros are at risk of having their heads in the sand when it comes to some of the newer trends involving unstructured data, including, but not limited to, SharePoint. It’s time to pay attention. Your users and customers are.”

I’m embarrassed to say that I was talking about the NoSQL movement without even knowing that the term “NoSQL” had been coined. I decided to revisit this topic because I’ve since heard and learned more about the movement. You’ll find a large amount of content that explores NoSQL. The following two articles do a nice job of summing up the main issues: “No to SQL? Anti-database movement gains steam” and “NoSQL – the new wave against RDBMS.”

NoSQL advocates say that traditional relational database management systems (RDBMSs) can’t scale and are too expensive to meet the enormous performance needs of modern web-based architectures for a Web 2.0 world. Instead, “web scale” applications using NoSQL have custom, proprietary data sources or perhaps a NoSQL database engine (and I use the term “database engine” lightly) such as MongoDB. Effectively, all of these NoSQL approaches avoid joins and avoid writing to disk as much as possible. Some would argue that they aren’t even databases in the traditional sense of the word and are simply highly distributed key/value stores that do most of their processing in memory.
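
As a rough sketch of what “avoiding joins” means in practice, here is the same data read two ways. Everything in it is made up for illustration: the customers and orders tables, the `customer:1` key, and the plain Python dict standing in for a key/value store are assumptions, not any particular product’s API.

```python
import sqlite3

# Relational version: two normalized tables, combined with a join at read time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0);
""")
relational_rows = conn.execute(
    "SELECT c.name, o.id, o.total "
    "FROM customers c JOIN orders o ON o.customer_id = c.id "
    "WHERE c.id = ? ORDER BY o.id",
    (1,),
).fetchall()

# Key/value version (hypothetical in-memory stand-in): everything one page
# needs is stored under a single key, so a read is one lookup and no join.
kv_store = {
    "customer:1": {
        "name": "Ada",
        "orders": [{"id": 10, "total": 25.0}, {"id": 11, "total": 40.0}],
    }
}
kv_value = kv_store["customer:1"]
```

The trade-off is visible even at this size: the key/value read is trivial, but the data is now shaped for one access path, while the relational tables can still answer questions nobody planned for.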

Unfortunately, many of the NoSQL-related articles I’ve read recently consistently beat the drum that traditional RDBMS engines are fundamentally non-scalable and that NoSQL is ordained as the future of web processing. I disagree. The NoSQL and RDBMS camps each have compelling pros and cons, and each can meet a wide variety of business needs. Neither side is 100 percent right or wrong, as is true in most techno-religious debates.

NoSQL got its start with the likes of Amazon and Google. Traditional RDBMS solutions at the time either didn’t meet these companies’ performance needs or, even where a build-out was technically possible, would have been prohibitively expensive because of licensing costs. This led to the growth of several open-source NoSQL approaches that do indeed offer very impressive performance numbers. However, don’t expect the same breadth of features, or even certain basics such as guaranteed atomic transactions. But you know what, sometimes that’s OK, as several case studies referenced in the articles I previously mentioned point out.

My primary beef with the NoSQL camp is the position that RDBMSs can’t scale. That simply isn’t true for 99.99 percent or more of the data processing world. Folks who say it’s NoSQL or nothing are in danger of throwing out the baby with the bath water by suggesting that NoSQL is always the answer and that it’s a foregone conclusion that NoSQL will eventually stamp out the relational data model.

Are companies such as Google, Amazon, and Adobe dumb? Their market caps suggest quite the opposite, and they all use high-profile NoSQL solutions for different needs. Clearly there are cases in which NoSQL solves business problems. Aside from the few cases in which petabytes of data simply won’t flow efficiently through a traditional RDBMS, I think the more compelling cases for NoSQL are cost and ease of use. Free is pretty compelling compared to potentially millions of dollars in licensing fees paid to Microsoft, Oracle, or other mainstream RDBMS providers, as long as the solution meets all of your business needs. On the ease of use side, NoSQL advocates will point to the long-held position that there tends to be an impedance mismatch between the set-based approaches of an RDBMS and the procedural style that computer programs are actually written in. I suspect that each of these dynamics will force traditional RDBMS providers to be more innovative in their offerings and price points, which is ultimately good for everyone.
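
To make the impedance mismatch concrete, here is a minimal sketch that saves and reloads one application object both ways. The `Post` class, the `posts` and `post_tags` tables, and the use of a JSON string as a stand-in for a document store are all hypothetical choices for illustration.

```python
import json
import sqlite3
from dataclasses import dataclass, asdict, field

@dataclass
class Post:
    title: str
    tags: list = field(default_factory=list)

post = Post("Hello", ["nosql", "rdbms"])

# Relational route: the object is split across two tables on the way in
# and reassembled from two queries on the way out.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag TEXT);
""")
post_id = conn.execute(
    "INSERT INTO posts (title) VALUES (?)", (post.title,)
).lastrowid
conn.executemany(
    "INSERT INTO post_tags VALUES (?, ?)", [(post_id, t) for t in post.tags]
)
title = conn.execute(
    "SELECT title FROM posts WHERE id = ?", (post_id,)
).fetchone()[0]
tags = [
    row[0]
    for row in conn.execute(
        "SELECT tag FROM post_tags WHERE post_id = ? ORDER BY rowid", (post_id,)
    )
]
from_sql = Post(title, tags)

# Document route (JSON string as a stand-in for a document store):
# the object is saved and loaded in the shape the program already uses.
document = json.dumps(asdict(post))
from_doc = Post(**json.loads(document))
```

Both routes return an equal object. The difference is how much translation code sits between the program and the data, which is the ease of use argument in a nutshell.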

Long live relational databases? Long live NoSQL? This debate is far from over, and both camps have a long and healthy product life cycle ahead of them. Expand your horizons and don’t be afraid to test your assumptions regardless of which camp you’re in today.



Comments
  • merriman
    2 years ago
    Oct 28, 2010

    To me the key is it isn't "one size fits all" any more. Many readers already use special tools for data warehousing and business intelligence. Why not add one more tool to the toolbox?

    NoSQL is not a 100% replacement for relational every time. Take ad hoc reporting, for example. With relational, one designs the data model (in theory) independent of the use case. Thus any future query, even one not envisioned, will be at least moderately easy to write. NoSQL is different. With MongoDB you design your schema for your bread-and-butter use case. This means that use case, the one that happens all day long, will be super fast. But an unanticipated ad hoc query, while still doable, is not quite as easy.

    SQL+relational gives us a standard between clients and server. Want to connect to a data warehouse with Crystal Reports or Business Objects? SQL+relational gives us a clean intermediary. But when writing custom applications that are new code, this is less of a concern: we are likely using some object framework anyway. It really depends on the use case.

    Relational can scale, but with caveats. For reporting it can scale: there we have a few very expensive queries, so it is easy to distribute them across all nodes, and we have few writes to the database, mostly reads. For online processing, scaling is harder. We might need to scale vertically instead. I have not seen good horizontal scaling on commodity hardware for online processing with RDBMS products. Another potential caveat is that the most scalable products in that space do not have open source versions available.

    For me, it's not all about scale. Flexible schemas create great agility for rapid development. I see a lot of developers with only one or two servers who use MongoDB because they can code faster and make revisions faster.

    So don't throw everything out, just add one more tool. Pick a first project, do a prototype, see what happens. I think one will be pleasantly surprised!

    dwight/mongodb
