Whisper.cpp - NVIDIA RTX 3090 vs Apple M1 Max 24c GPU
About a month ago, I finally made up my mind and purchased a used Mac Studio for about $1,200. It is the base model with an M1 Max 10-core CPU, 24-core GPU, 32GB RAM, 512GB SSD and, last but not least, a standard 10GBASE-T port that is also compatible with 2.5GBASE-T and 5GBASE-T.
The famed “Apple Experience” is no different on this machine. It is quiet, it is lightning fast for what I do daily, and it barely runs above 40 degrees C. I love everything about this machine, and I would probably say this is the best computer for $1,200.
What’s really interesting is that Apple has taken the Unified Memory approach: a single RAM pool shared between VRAM and system RAM. The concept is nothing new, since most modern CPUs with integrated graphics already use system RAM as the iGPU’s VRAM. However, no one else has paired a CPU with such a powerful iGPU, nor has anyone given an iGPU this much memory bandwidth, at 400GB/s.
But how does this compare with a “real” computer? Let’s find out.
Specs:
PC:
CPU: Ryzen 7 7700X 8-core
GPU: NVIDIA RTX 3090 24GB
RAM: 32GB DDR5-5600
SSD: Kingston KC3000 1TB
If you build it out exactly like mine, you are probably looking at $1,200.
Mac Studio:
CPU: M1 Max 10-core
GPU: 24-core
RAM+VRAM: 32GB unified LPDDR5
SSD: Integrated 512GB + Samsung 980 Pro 2TB in a TB3 enclosure
Tests:
This section will be updated over time, but the first entry compares performance when using Whisper.cpp to transcribe a 19-minute podcast spoken in a mix of Mandarin Chinese and English.
Whisper.cpp built-in benchmark
Configuration: 4 threads, PC with cuBLAS and Mac with Core ML.
Result
Category | RTX 3090 | M1 Max 24c GPU | M1 Max time / RTX 3090 time |
---|---|---|---|
Load Time | 2081.38 ms | 1015.44 ms | 0.49× |
Encode Time | 133.96 ms | 571.48 ms | 4.27× |
Decode Time | 3424.47 ms | 3728.00 ms | 1.09× |
Batchd Time | 980.06 ms | 2398.49 ms | 2.45× |
Prompt Time | 511.76 ms | 2164.99 ms | 4.23× |
Total Time | 5050.95 ms | 8863.60 ms | 1.75× |
The first thing that caught my eye is how much faster the Mac Studio loads the model. That difference shaved about 1 second off the total time, and the Mac keeps its lead in that category in the next test. However, in this benchmark that is the only category where the Mac beats the PC with a REAL GPU. The RTX 3090 obliterates the M1 Max 24c GPU in the categories that lean on raw GPU compute: encode and prompt take more than four times as long on the Mac, although decode is nearly a tie.
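To make the comparison easy to re-check, here is a minimal Python sketch that recomputes the relative speed from the raw timings in the table above. The dictionary and function names are my own for illustration; the only inputs are the millisecond values already listed.

```python
# Timings in milliseconds, copied from the built-in benchmark table above.
# Each entry is (RTX 3090, M1 Max 24c GPU).
bench_ms = {
    "load": (2081.38, 1015.44),
    "encode": (133.96, 571.48),
    "decode": (3424.47, 3728.00),
    "batchd": (980.06, 2398.49),
    "prompt": (511.76, 2164.99),
    "total": (5050.95, 8863.60),
}


def time_ratio(rtx_ms: float, m1_ms: float) -> float:
    """How many times longer the M1 Max takes than the RTX 3090."""
    return m1_ms / rtx_ms


ratios = {name: round(time_ratio(rtx, m1), 2) for name, (rtx, m1) in bench_ms.items()}

for name, ratio in ratios.items():
    # A ratio below 1.0 means the Mac finished that stage faster.
    print(f"{name:>6}: M1 Max takes {ratio:.2f}x the RTX 3090 time")
```

A ratio is less ambiguous than a signed percentage here: 4.27x means the stage takes a bit more than four times as long on the Mac, and 0.49x means the Mac needs about half the time.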
Whisper.cpp, 19-minute audio transcription, Mandarin Chinese and English spoken
This is more of a real-world test with an actual workload.
Category | RTX 3090 | M1 Max 24c GPU | M1 Max time / RTX 3090 time |
---|---|---|---|
Load Time | 2008.59 ms | 1026.31 ms | 0.51× |
Fallbacks | 0p / 1h | 3p / 23h | N/A |
Mel Time | 382.32 ms | 386.86 ms | 1.01× |
Sample Time | 6628.67 ms | 6191.47 ms | 0.93× |
Encode Time | 5368.99 ms | 24429.72 ms | 4.55× |
Decode Time | 2616.88 ms | 751.55 ms | 0.29× |
Batchd Time | 69074.67 ms | 244340.34 ms | 3.54× |
Prompt Time | 1542.44 ms | 6026.75 ms | 3.91× |
Total Time | 87792.72 ms | 283626.41 ms | 3.23× |
As expected, the longer the workload runs, the less the initial load time matters. In this real-world test, the Mac Studio took almost 5 minutes (about 284 seconds) to transcribe the 19-minute recording, while the RTX 3090 took only about one and a half (about 88 seconds). The Mac did post lower decode and sample times, but the encode and batchd stages dominate the total. The speed difference is there; even Apple can’t bend the laws of physics.
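Another way to read these totals is as a real-time factor: how many seconds of audio each machine gets through per second of wall-clock time. The sketch below assumes the recording is exactly 19 minutes (the post only gives the rounded length), so treat the results as approximate. The variable names are illustrative.

```python
# Assumption: the podcast is exactly 19 minutes long; the real file may differ slightly.
AUDIO_SECONDS = 19 * 60

# Total Time values in milliseconds from the transcription table above.
total_ms = {
    "RTX 3090": 87792.72,
    "M1 Max 24c GPU": 283626.41,
}


def realtime_factor(audio_seconds: float, processing_ms: float) -> float:
    """Seconds of audio transcribed per second of processing."""
    return audio_seconds / (processing_ms / 1000.0)


factors = {name: realtime_factor(AUDIO_SECONDS, ms) for name, ms in total_ms.items()}

for name, factor in factors.items():
    minutes, seconds = divmod(round(total_ms[name] / 1000.0), 60)
    print(f"{name}: {minutes}m{seconds:02d}s wall time, about {factor:.1f}x real time")
```

Under that assumption the RTX 3090 lands at roughly 13x real time and the M1 Max at roughly 4x, so both are comfortably faster than playback.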
Ah, the eternal question: why choose the M1 Max when it seems to lag behind the PC in almost every test? I pondered over this a bit and realized it's all about understanding our own needs and preferences.
First up, let's chat about Windows. Oh, Windows 11, you quirky character! It's like that friend who means well but keeps tripping over their own feet. It's veered off from being the reliable workhorse we all knew, morphing into something of a billboard for Microsoft's other ventures. From its merry-go-round of updates to its dance of inconsistent UIs, Windows 11 is a bit like a variety show – you never know what you're getting next! And let's not forget those oh-so-energetic animations that seem more about flair than function. It's like a tech version of "trying too hard to be cool."
Now, let's switch gears to the Mac experience. Picture this: A serene workspace where your computer hums along quietly, almost whispering. That's the Mac Studio for you! Running tests on the PC feels like commanding a space shuttle – the roar of the fans, the RTX 3090 working overtime, guzzling power like it's going out of style. Now, compare that to the Mac Studio, which is more like a zen master, calm and composed, sipping just 80W of power. It's a stark contrast, isn't it?
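The 80W figure invites a quick back-of-the-envelope check on energy per job rather than power alone. I did not measure the PC's draw, so the sketch below does not guess at it. It treats 80W as the Mac's average draw during the 19-minute transcription (an assumption) and works out the average power below which the PC would use less total energy for the same job, given that it finishes sooner.

```python
# Assumption: 80 W is the Mac Studio's average power draw during the transcription run.
MAC_WATTS = 80.0

# Total Time values from the transcription test, converted to seconds.
MAC_SECONDS = 283626.41 / 1000.0
PC_SECONDS = 87792.72 / 1000.0

# Energy = power x time. 1 Wh = 3600 J.
mac_energy_wh = MAC_WATTS * MAC_SECONDS / 3600.0

# Average PC power at which both machines would use the same energy for this job.
# Below this figure the faster PC uses less energy overall; above it, the Mac does.
pc_break_even_watts = MAC_WATTS * MAC_SECONDS / PC_SECONDS

print(f"Mac energy for the job: {mac_energy_wh:.1f} Wh")
print(f"PC break-even average power: {pc_break_even_watts:.0f} W")
```

With these numbers the break-even sits at roughly 258W for the whole PC, so the comparison could go either way depending on what the RTX 3090 system actually draws under this load.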
So, while the PC, with its beefy CPU and GPU, flexes its muscles and shows off its power, there's something undeniably charming about the Mac Studio's quiet confidence. It's like choosing between a flashy sports car and a sleek, eco-friendly electric vehicle. Both have their allure, but in the end, it's about what brings joy and ease to your daily life.
In a nutshell, it's not always about raw power; sometimes, it's the quiet ones that make the biggest impact!