Futuremark: Has it lost its focus?
The world of graphics cards has been very exciting to follow over the past couple of years. Competition between the two giants (ATi and Nvidia) is at an all-time high, and new products seem to be announced all the time, which is certainly good for both computer enthusiasts and casual users as it puts more power in our hands. Things have not been that ideal, though. The past months have seen allegation after allegation of cheating on Nvidia’s part in order to regain the lost (and highly coveted) No.1 spot in the performance arena. At the centre of it all is Futuremark, a Finnish company which produces one of the most controversial benchmarks of our times, 3DMark03.
A small historic background
In case some of you don’t know what has happened, a small historic background: The start was made by the now infamous Extremetech article
which exposed Nvidia’s and ATi’s “optimizations” to help their cards improve scores in the 3DMark03 benchmark. But whereas ATi was gaining a maximum of 2% in the final 3DMark03 score, Nvidia’s gain was a whopping 24%! From that point on, things really got interesting. ATi responded immediately, explaining the nature of the optimizations it had made, and removed them from the next version of the Catalyst driver suite. Nvidia’s reaction was totally different: it denied everything and actually accused Futuremark of trying to make its products look bad. To make a long story short, amidst new discoveries of Nvidia’s questionable “optimizations” in other games as well, it seems that Nvidia and Futuremark have put their differences behind them, and Nvidia is back on board as a Futuremark beta member, thus endorsing the product once again. Following talks with the Santa Clara-based graphics company, Futuremark stated that it is investigating whether some optimizations are valid after all, and that it doesn’t believe what Nvidia did was cheating but rather “application specific optimizations”.
Where do we stand now?
After watching all these developments take place, it is my firm belief that Futuremark has lost its focus as a company. Its latest actions firmly establish that in my mind. To better understand why, we must stop for a minute and think about some things that are often overlooked. What is a benchmark? A benchmark is "...a standard by which something can be measured or judged" according to Dictionary.com. So benchmarking is "...to measure (a rival's product) according to specified standards in order to compare it with and improve one's own product" according to the same source. Now that's what 3DMark was once all about. It was a standard by which we measured our graphics cards' performance in order to get an indication of their speed and feature-completeness. 3DMark03 was initially, in my eyes, a step in the right direction; whereas 3DMark 2001SE gave you a system performance indication, 3DMark03 is heavily (if not completely) dependent on your graphics card's performance at this point in time. Hence the various reviews we have seen around the web, where a system with a simple low-end AthlonXP could easily compete with high-end models.
Then the whole Nvidia fiasco took place... and now, after Nvidia officially rejoined the Futuremark beta program, the company seems to want to discuss where shader replacements are "acceptable". I quote Patric Ojala, one of the best-known Futuremark employees active in the community, from a statement he made in the Beyond3D forums:
“Further discussions have shown that under some circumstances (that we're trying to define at the moment) shader replacements can be accepted.”
Here’s where the problem essentially lies. Let’s stop for a second and remember one more time the definition of what constitutes a benchmark: it is a standard by which others are measured or judged. When a graphics card driver replaces one or more of the shaders that the benchmark uses with similar ones, even if the output is exactly the same, the benchmark stops being a standard. It stops being the constant by which others are measured, and consequently it stops being a benchmark!
I will elaborate on this by making an analogy; an example that I experienced this past year. As an MSc student at the University of Reading, I had to optimize a piece of C++ code for the Pentium processor pipelines. So what I had to do was take the compiled code, disassemble it (that is, convert the binary code back into assembly code) and then use Intel's VTune software (a tool that software engineers use to measure code performance and find culprits and bottlenecks, and thus make the code more efficient) to make it run faster. Initially the small program needed approximately 2000 CPU cycles to run. After I was done with it, it needed only 29: roughly a seventy-fold improvement. Don’t forget, the output was exactly the same and the job it did was exactly the same, but I was making 100% use of the Pentium processor's features to make it run faster: shuffling commands, substituting some commands with others that do the same thing in less time, and even removing some commands entirely. If I had wanted to optimize it for a Pentium4 or an AthlonXP processor, I would have had to follow different steps to take advantage of that particular platform and architecture, and so on and so forth.
In this example, the small piece of code is your "benchmark". If you run it as-is on a Pentium processor and on a Pentium4 processor, you will notice the obvious difference between the performance and speed of the two CPUs. But if you code a detection routine that checks whether a Pentium or a Pentium4 processor is present, and according to its findings runs two different versions of the program, each heavily optimized for its processor, you will NOT be able to compare the two processors. Why? Because the code is not constant anymore. And you need it to be constant in order to draw useful conclusions. It’s like having a moving target: you will never be able to hit it because it keeps on changing.
Where do we go from here?
I am afraid that with its latest actions and statements, 3DMark has lost its place in the industry. Officially allowed shader replacements will invalidate it as a benchmark (as they largely have already), and Futuremark will have hammered the final nail into its product’s coffin. Whichever side you're on, that is not good for the industry. We need means to compare our graphics cards. You see, having a program that shows us the real potential of our cards is one thing, but in order to compare one card to another and make an informed buying decision, the playing field must be level. Card X may have a gazillion new features and look attractive, but what good are they if they are never utilized? Or if, when utilized, performance drops to a crawl? Support for future games is a very catchy line, but supporting something is a whole different ballgame from actually being able to run it adequately.
We need benchmarks. That much is certain in my eyes. Having game-based benchmarks alone is simply not enough; hardware evolves so rapidly that software is always playing catch-up, and that has been true since the era of computing started. We never buy a graphics card only to run today’s games. We make an investment, and we want this investment to last for as long as possible. Benchmarking today’s games can NEVER give us that; we have had DirectX 9 certified cards for over a year, and so far the DirectX 9 titles can be counted on the fingers of one hand. So benchmarks have their place, but we need them to be a place where all cards are treated equally. No advantages. And in light of the recent developments, I am not sure that 3DMark can be that benchmark anymore. Our alternatives? I simply don’t know; it seems we are at a crossroads. I guess time will tell.
Just my two eurocents...
The views and opinions expressed within this article may or may not represent the views and opinions of the staff at Rage3D.com