Last week, my business partner sent out a message to all the partners that included the following comments, which I’ve slightly edited to remove specific comments about our company:
“Over the past couple of years, I have been closely watching some technology changes that are occurring in the storage area. The technology I am speaking of is that of SSD—Solid State Drives. Over the past year, I have been teaching about their imminent arrivals within the IT community. Earlier this year, significant advances in the technology have hastened that arrival.
“As you may know, Apple with its 'Air' laptop recently began shipping a SSD as its primary hard drive. Quickly following that was Dell and several others. In addition, there have been several companies that have released server-class storage products based on SSD technology. The performance increases between legacy spindles and SSD are phenomenal. For example, seek times for a top-end SCSI or SAS drive is 3-6 milliseconds. For SSD it is 15 microseconds. IOPS (I/Os Per Second) for legacy drives are measured in hundreds per spindle. For SSD it is measured in hundreds of thousands.
“Those types of performance numbers at the storage level can do miraculous things to cover up extremely poor database design and inefficiency. This type of technology widely implemented may render many (or most) current tuning problems a moot point.
“In the past, the costs of these systems have precluded them from widespread use but that is changing, and by next year I believe we may begin encountering server-based SSD installations; if not in 2009, definitely by 2010.”
So what does SSD mean for the glorious profession of database tuning? If developers can write any code, with any (or no) indexes, and still have it perform well, why would tuning matter? I'm sure there are purists among us who would still want to get the best code and best indexes they could. But others, who aren't so well versed in monitoring their SQL Servers and examining their query execution plans, might not even realize there are improvements that can be made, if no one is complaining about slow-running queries.
There's still the aspect of concurrency and blocking problems that database tuners might have to busy themselves with. As I tell my students, “You can have the fastest running query in the world, but if it can’t get to the data it needs because the data is locked exclusively by another process, it doesn’t matter how fast your query is when run in isolation. It will appear that the query is slow." I also think that database design will start to become a bigger issue. Most of my clients just want the query to run fast, and now. No one wants to even think about completely redesigning their database. But poor database design can not only lead to slow queries, which we might not be worrying about anymore, but also more overhead to make sure duplicate data is managed appropriately and your queries are written correctly. Database design is a seriously overlooked topic, and maybe if there’s less need to worry about query speed there will be more time to evaluate your design. That’s not a bad thing at all!
So what do you think? How will your tuning needs change if your queries start running a thousand times faster? Will you worry about good indexes and execution plans at all?
If you don't have a username & password, please
register now.
Reader Comments
Just my opinion but solid state hardware allows for much more automated code and database generation. Instead of spending as much time hammering out your data structures you'll use a tool like an O/R mapper to design your business logic and leave the db to the tool. It may not be the most efficient structures but the minimal gains in performance that will be available won't, except in exceptional situations, be worth the cost. Not only that but it changes how you interact with the files system since the database isn't the only thing on disk.
BillS- September 04, 2008
Article Rating 4 out of 5
Even when query data ends up in cache, which is an even faster context of query execution than acessing data from SSD, poor query design hurts. It always shows with big CPU spikes. So we are not near needing not to optimize queries. About database design, our experience proves that a poor design ends up in writing more sql code and more complex sql code. It just makes application maintenance and evolution more and more complicated with the time. I agree that SSD will forgive more sloopy SQL, so less optimization work will be needed specially with smaller tables.
pelsql- September 05, 2008
Article Rating 4 out of 5
just thought I'd mention there's been solid state drives for decades, they've been used on "big iron" IBM mainframes, etc, as very fast drives for things like - tada - databases :).
but - to echo some of your other comments - just cuz you've removed alot of the storage access time doesn't mean poor query design won't still be uber-slow - a full table join on a multiple-million row table without an index is gonna hurt no matter what you do.
cwbutler- September 05, 2008
Article Rating 3 out of 5
I just did a similar piece in TDAN on the impact of SSD on databases. But there is another trend that is also going to hit us -- multi-core processor chips. That means I can assign a dedicated processor to a chunk of SSD with off-the-shelf hardware. Processing is about to become massively parallel. Indexing does not work so well in that environment. All the proprietary auto-numbering for keys was never relational, but it will soon be dangerous because it is inherently not parallel. We are already seeing huge numbers of assorted data users, feeding and using database via the Internet on all sorts of hardware; it is going to get worse.
Hashing will work in this environment and hashing depends on actual relational keys, lack of redundancy and all that jazz Dr. Codd told us about. Those of us who have been strict about following the rules and doing solid relational schema designs are going to be the only survivors. The last vestiges of sequential magnetic tape file processing are about to learn how evolution works.
jcelko- September 05, 2008
Article Rating 5 out of 5
Kalen,
I have read your articles for years and have great respect for your insights. That being said I would like to remind you of something I learned early on in my career "The Triangle of Performance".
To summarize it says that when tuning you must consider the relationship between the three sides of the triangle which are CPU, I/O, and MEMORY. There is always a bottleneck that is occurring in one of the sides. As you beef up (in case of I/O speed up) one side of the triangle you push the bottleneck onto the other sides of the triangle, until one becomes the new bottleneck. However there is never a way to totally remove yourself from one of them becoming the bottleneck that destroys performance at some level.
There has been little tiny increases on the I/o side over the years when it comes to DB servers. Things like SCSI, Raid, SANs, data spread across multiple servers, and Partitioning. Each one has offered a gain though not at the level that Solid state could offer. So for most shops the I/O has been the bottleneck. Just because that will now be changing in a substantial way doesn't mean the triangle is broken.
Tuning will always matter when the companies profits are tied to the amount of transactions that occur on the server, and scalability in terms of DB size and # of users is the goal. My example would be Amazon I go on there and search for a book add it to a cart and checkout. If any of those steps took more than 5 minutes will I come back? Not bloody likely. If I can get speed by having a proper data model and access that to me justifies the cost as much as purchasing more and more iron to throw at the problem.
So now we can embrace faster I/o believe me I'm all for it. Maybe now we can get away with a correlated sub query, or a cross join on 2 1TB tables, or worst of all reporting and a OLTP server. However believe me when I say one way or the other when that server tanks at 4 AM the DBA will get that call.
Tony DBA
mudluck- September 05, 2008
Article Rating 5 out of 5
I think that there's a flip side that is not addressed here. If query times are 1000 times faster then the times that queries are causing locks will be 1/1000th the current time. I imagine that will significantly reduce blocking issues in all but the most poorly written SQL or badly designed systems. Hardware improvements have always been used as a crutch for bad design from development. And this will just continue the trend.
garrettp1- September 05, 2008
Article Rating 2 out of 5
Given the performance enhancements described above, I would still want to pay equal attention to the proper structure and design of the database. While the time it takes to run large queries will dramatically improve there are still issues, primarily memory usage that will still need proper attention.
thomasam- September 05, 2008
Article Rating 5 out of 5
I would recommend readers to take a look at an old paper from 1994: "Cache Conscious Algorithms for Relational Query Processing" http://www.vldb.org/conf/1994/P510.PDF Back then also new hardware was creeping in (huge caches) and databases were questioning how to leverage it. The new hardware was faster, but when new algorithms specifically designed for it came up, they blew every previous benchmark apart. So my point is that it doesn't matter if new, faster iops come up. If there are faster algorithms to leverage the higher iops, they will prevail. Today's indexing techniques can still leverage these new drives better than flat out heaps. For a while they will still be used, but perhaps they will be revisited and newer storage techniques will appear that even better fit the SDD. I don't think that databases will just settle for 'is fast enough'. If it can go faster , they will always drive for it. Market will push for it :) Luckily the database market is not 99% dominated by one vendor like other software markets so it cannot affords to stagnate...
If today's application can be ported to a 100x faster iops and dispose those indexes, tomorrow a new application will appear that will require the 100xiops *and* fast, efficient data access. Whether is a larger numerical simulation, or a real-time BI analysis, who knows.
remusrusanu- September 08, 2008
Article Rating 4 out of 5
Didn't this sort of phenomenal change happen in the non-database world not so long ago, when memory was really expensive and 16MB was thought to be huge. Now all we do is shove as much memory in the box as we can and not worry about performance tuning our code - our .Net developers don't evan know what performance tuning is...perhaps it will be the same for database developers in a few years.
DBAFlash- September 08, 2008
Article Rating 5 out of 5
Exactly. If a "new storage technology ... appear[s] that could have I/O's per second (IOPS) rates thousand of times faster than they are now", databases will simply evolve to consume those resources.
Looking forward to maintaining that 5 YottaByte database...
Larry Leonard- September 11, 2008
Article Rating 3 out of 5
Never under-estimate the fact that the SQL optimizer can and sometimes does choose a horrible execution plan, it will take an otherwise well performing query and make it take exponentially longer to execute. The query plan could be using up unnecessary io reads or spinning cpu. In the face of this, faster disk io will be of little consequence. SQL Server has the ability to take even the fastest hardware available and turn it into a puddle of molten metal with nothing other than an inefficient query plan.
Yes, with SSD, for 90% of the database world run in small to medium size shops with small to medium size databases, any performance problems will all but disappear. SQL Server is virtually self-tuning in these small environments and most of these little shops have no DBA for this reason and can't understand why they would ever need one. The other 10% know the real value of seasoned DBA who knows how to really "tune" a db and "that" will never be replaced by faster hardware.