Here’s an interesting question I was sent by my friend Steve Jones over at SQL Server Central – will a single CPU with dual cores perform better than two single-core CPUs? Both have two processing cores but the hardware architecture is different – which one will make SQL Server perform better? Well, there’s no hard and fast answer – it depends!
I had a discussion on this topic this morning with Jerome Halmans, part of my old team in the SQL Server Storage Engine, and I’m basing this post on our discussion with his permission.
My hypothesis (which Jerome confirmed) was that the performance of the two architectures depends on the number of cache line invalidations and how they are managed (see here for a description of CPU caches and cache lines).
Here is a very interesting and accessible Intel article that discusses cache sharing in multi-core Intel systems. In this paper at least, the L2 cache is shared but modifications made by different cores in their private L1 caches still need to bounce through the shared L2 cache before being loaded by the other core. This will still be WAY faster than having to go through the main bus between single-core CPUs.
And here is a similar paper from AMD on their Barcelona multi-core architecture that describes each core having separate L1 and L2 caches, with an additional shared L3 cache. The separate L2 caches are kind-of linked though, in that modifications to a cache line in one L2 cache are immediately mirrored in the other L2 caches (if needed).
But the number of cache invalidations (of whatever kind) depends on the workload. The two types of workload to consider are:
Saying that, the majority of workloads on SQL Server are of the second type above. Jerome mentioned that even synthetic workloads (such as the TPC-C benchmark) are still going to result in multiple threads accessing and changing the same data/index pages.
So – what’s the conclusion?
I expect that a multi-core CPU will outperform an equivalent number of single-core CPUs in most workloads. And as Jerome pointed out, even if that’s not the case for your workload, you’ll find it pretty hard to find a system that ships with single-core CPUs these days. I’d love to hear any comments on this, especially any measurements you’ve done on workloads as I don’t have any single-core machines available to run tests on – even the laptop I’m typing this on is a dual-core Centrino.
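To make the workload dependence a bit more concrete, here is a toy Python model. It is a sketch under stated assumptions, not a measurement and not how any particular CPU implements cache coherency: it assumes a simple write-invalidate scheme and a 64-byte cache line, and it only counts how often a write by one core invalidates a line held by the other. The two workloads are assumed stand-ins (one where each core updates its own memory, one where both cores update the same location, as when multiple threads change the same data/index page), and `count_invalidations` and `interleave` are hypothetical helper names.

```python
from collections import defaultdict

CACHE_LINE_BYTES = 64  # assumed line size; common on x86 but not universal


def count_invalidations(writes, line_bytes=CACHE_LINE_BYTES):
    """Count cross-core invalidations for a sequence of (core, address) writes.

    Simplified write-invalidate model: a write by one core invalidates the
    copy of that cache line held by every other core that currently has it.
    """
    holders = defaultdict(set)  # cache line number -> cores with a valid copy
    invalidations = 0
    for core, address in writes:
        line = address // line_bytes
        others = holders[line] - {core}
        invalidations += len(others)
        holders[line] = {core}  # the writer now holds the only valid copy
    return invalidations


def interleave(core0_addresses, core1_addresses):
    """Alternate writes from core 0 and core 1, as if both ran concurrently."""
    writes = []
    for a, b in zip(core0_addresses, core1_addresses):
        writes.append((0, a))
        writes.append((1, b))
    return writes


# Assumed workload A: each core updates its own region of memory.
independent = interleave([0] * 1000, [4096] * 1000)

# Assumed workload B: both cores update the same location (same page).
shared = interleave([0] * 1000, [0] * 1000)

print("independent:", count_invalidations(independent))
print("shared:     ", count_invalidations(shared))
```

In this model the independent workload causes no invalidations, while the shared one causes an invalidation on nearly every write. The model says nothing about what each invalidation costs, which is exactly where the difference between a shared cache and the main bus would show up.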
5 thoughts on “Search Engine Q&A #5: Do multi-core CPUs perform better than single-core CPUs?”
I certainly won’t argue with your reasoning. I will suggest that another factor in deciding whether a workload needs to cross the CPU boundary is related to the ccNUMA architecture, which Windows and SQL Server support. I’ve performed our large-scale performance & scalability testing on both Intel and AMD-based CPU architectures and noticed a significant difference in CPU load distribution. AMD CPUs support ccNUMA – and SQL Server distributes load very differently on these CPUs than it does on the Intel CPUs. I haven’t tested the latest Intels though (my testing experience in this area is 2 years old). At the end of the day, both CPUs gave us the same performance result.
We tested on the HP DL580 (Intel) and 585 (AMD), both 4-socket, dual-core servers. It was interesting to notice that SQL Server kept 2 sockets very busy on the 585 – redirecting queries to the same sockets where they started, thus allowing them to access their local cache data. The DL585 has several NUMA nodes, whereas the 580 only had 1. So on the Intel server, load was evenly distributed across all 4 processors. On the AMD the workloads favored 2 CPUs. I would guess, based upon your blog posting, that there were many cross-CPU accesses on the Intel, and fewer on the AMD.
Again – both gave more than adequate performance. However the profile was different. Each had a different approach, but got the job done.
Very interesting – thanks for commenting.
Hi Paul,
Just to drop a line, you may find this series of insights from Ulrich Drepper (from Red Hat) about memory mechanisms interesting.
http://lwn.net/Articles/250967/
CPU caches detailed in section 2.
Awesome – thanks David.
why u no benchmark?