Solid State Disks (SSDs) are devices that take advantage of the high speed nature of flash memory chips to provide fast persistent data storage. SSD’s come in a variety of forms that are suitable for different purposes. The most common forms of SSD are:

  • 2.5” disk replacements. These devices are SSD’s with the interface electronics of a standard 2.5 inch moving head disk. You commonly find them in laptops. They’re also installed in a few desktops and even a some servers. They’re made by companies such as OCZ and Intel. No special purpose drivers are needed for this type of SSD because it looks to the computer just like a fast disk. I installed a 200 GB SSD of this type when I upgraded to Windows 7. I’m very happy with it so far.
  • PCI-e boards. These devices use the same chips as other disk replacement products but put the chips on a PCI-express board. This allows the device to take advantage of the speed of the PCI-express bus to provide a higher number of I/O operations per second and more throughput than is possible when moving data through a disk interface. These devices are manufactured by companies such as Fusion-IO, OCZ, and Dolphin.
  • Rack mounted SSD devices. These devices have their own chassis and connect with the same Host Bus Adapters (HBA) that might be used to connect to a SAN. A typical manufacture is Texas Memory Systems. These systems tend to be expensive but fast replacements for SAN’s.
  • SSD enhancements to a standard SAN. Each SAN manufacturer implements SSD’s in a different way although some use 2.5” disk replacements. These give you a boost over the speed of rotating disks in the same SAN but the data still must navigate the narrow data path between the CPU and SAN.

I’ve had two chances to work with Fusion-IO boards. The first opportunity was about a year ago when Fusion-IO lent an 80 GB board to me for testing in my server. I ran a series tests on the board with SQLIO, a disk benchmarking tool, and was able to easily get it up to 450 MB/sec of throughput and 70,000 sequential I/O operations per second. Of course, the small size of each I/O needed to get the count up to 70,000 is unrealistic for SQL Server. Tests with SQL Server were also promising. With the larger I/O sizes used by SQL Server, bandwidth on the PCI-e bus became the limiting factor but I still got thousand’s of I/O’s per second.

As part of a recent client engagement I tested a 300 gigabyte board distributed by HP, but manufactured by Fusion-IO. This device is a card installed on a PCI-e Expansion Blade adjacent to the server blade. There was only one Fusion-IO board so a decision was made to test it as a location for tempdb. The expected MTBF for this board is similar to the MTBF for enterprise class rotating hard disks, around 1,000,000 hours. Device failures are still possible and the loss of tempdb was chosen as the least risky option.

The reason to use SSD cards with SQL Server is the high number of random I/O operations that can be performed with low latency and the improved throughput. Let’s look at a comparisons of LAN storage and SSD that were created using SQLIO. This first table is the performance of a 40 disk aggregate on a new NETAPP SAN. That is 40 drives in a RAID 6 configuration. You can see that for write the aggregate provide 2,337 IO/second and for read IO/s per second was 2,541. The throughput measured in megabytes per second from this test and others shows that it ranges from 280 megabytes per second up to 350 megabytes per second. From this test and others it appears that the throughput in megabytes per second is the limiting factor.

SAN Single Aggregate-40 Disks

SQLIO – Byte Size

64

128
Random Write
IOs/sec 4,443 2,337
MBs/sec 277 292
Min_Latency(ms) 0 2
Avg_Latency(ms) 3 6
Max_Latency(ms) 420 75

Random Read
   
IOs/sec 3,462 2,541
MBs/sec 216 317

Min_Latency(ms)
0 0
Avg_Latency(ms) 4 5
Max_Latency(ms) 1,050 1,085

(ms) is milliseconds

Here is the corresponding test with the Fusion-IO SSD board:

Fusion-IO Board/SSD
SQLIO – Byte Size 64 128
Random Write
IOs/sec 7,602 3,580
MBs/sec 475 447
Min_Latency(ms) 0 0
Avg_Latency(ms) 1 4
Max_Latency(ms) 74 72
Random Read    
IOs/sec 10,093 5,738
MBs/sec 630 717
Min_Latency(ms) 0 0
Avg_Latency(ms) 1 2
Max_Latency(ms) 47 50

The table shows that the SSD board is able to achieve 3,580 IO/Seconds for write and 5,738 IO/Second for read both on the 128 byte test. The throughput in megabytes/second is also much higher than for the disks at 447 for write and 717 for read. Again, the throughput seems to be the limiting factor. Another difference of the board level SSD’s is their short latency. The average latency for both read and write is less than an aggregate of 40 drives.

Cost can also be a factor in the decision to use SSD’s. The 300 GB SSD board costs around $14,000. That is a significant savings over using 40 SAN hosted drives for tempdb and there’s still a performance boost.

Theoretical I/O latency numbers may be a good indicator but real world examples should be more convincing. The following table compares the run times of several stored procedures from the development system, which has a heavy load. The procedures were chosen because they make heavy use of tempdb and thus are the best candidates for performance improvement. This should be considered anecdotal evidence because they comparisons are not based on processing the same data. However, the improvement did seem to hold up over time.


Procedure

Avg sec on Hard Disk (40 drive aggregate)

Avg sec on SSD

Improvement in seconds

Percent Improvement

DBX.update_ret_data

1218

526

692

56%

DBP.process_sec_data

1207

1035

172

14%

DBG.update_rates

1344

1062

282

21%

These examples show that it is possible to reduce the run time of procedures that make use of tempdb. Of course, since the Fusion-IO board only has tempdb it’s only when using tempdb that there will be any improvement. SQL Server uses tempdb for temp tables that that user code creates with names that begin with # or ##. It also uses tempdb to store table variables, table valued parameters, and xml variables. When needed tempdb is used for work tables for sorting, hashing, group by, distinct, snapshot isolation, and index rebuild operations.

Although still new SSDs are a hardware option with promise for significant speed improvements in SQL Server runtimes. There is also a potential for cost savings due to the ability of SSDs to provide a high number of I/O operations and throughput that would require a very large number of hard disks.

———————————————————————–

FAQ:

What type of chips are used in SSD products?
Today SSD’s are made with NAND flash chips but there are two types, MLC and SLC. Multi-Level cell (MLC) chips are the most common and they store two bits to each flash chip cell. You’ll find MLC chips in cameras, phones and most 2.5” SATA disk replacement products. They may wear out after being used for tens of thousands of erase/write cycles. Single-Level Cell (SLC) chips store only one bit per flash cell but last around one hundred thousand erase/write cycles.  The larger SSD’s boards are made with MLC chips.

I’ve heard that Flash disks wear out. How long does the Fusion-IO board last?
Although the 300 GB Fusion-IO board uses MLC chips from Samsung that’s only part of the longevity answer. SSD products don’t write the same disk “sector” onto the same flash chips each time the sector is modified. Instead it writes to the recently erased chips and keeps track of where each sector is located. Even if writing was taking place 100 percent of the time, 24×7, it would take several years to wear out the chips. Fusion-IO gave an example of a 320 GB board undergoing 5 terabytes of writes per day. It would last for almost 20 years. 5 terabytes is a realistic number. It equals a sustained rate of 60 megabytes per second of write 24 hours a day. Reads don’t wear out flash chips.

Can SSD’s be used in a SQL Server clustered environment?
SSD’s in LANs and External Rack Mounted SSD products can be used in clusters. However, the Fusion-IO board is internal to a single computer. Standard clustered configurations require that all databases, including tempdb, be on storage that can be shared between members of the cluster so the Fusion-IO board doesn’t qualify. There are non-standard cluster configurations (hacks that work but don’t call Microsoft) where tempdb doesn’t have to be shared and the Fusion-IO board could be used

About the Author

Andrew Novick is a SQL Server MVP and Principal of Novick Software where he develops business applications as a consultant using SQL Server ASP/ASP.NET, C#/VB.Net, and XML. He is the author of Transact-SQL User Defined Functions.
2010 marks his 39th year of computer programming, starting in High School with a PDP-8 and moving onto a degree in Computer Science, an MBA, and then programming mainframes, minicomputers, and for the last 20 years, PCs. When not programming, he enjoys coaching Little League baseball, woodworking, mowing the lawn, and the occasional movie with his wife. He can be reached at [email protected].