More than three years have passed since Apple released the M1 chip (see my earlier post on first impressions of the Mac mini M1: programming, gaming, deep learning). Running LLMs on a Mac has matured a lot in that time: with Ollama, LM Studio and a pile of other tools, getting a large model running locally takes practically no skill at all.
And with enough memory (say, a 128GB MacBook), running a 70B Llama locally actually has some real use cases.
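As a rough sanity check on why 128GB matters (a back-of-the-envelope sketch, not a benchmark; the bits-per-weight figures are approximate):

```python
# Back-of-the-envelope memory estimate for running a quantized LLM locally.
# All numbers here are illustrative assumptions, not measurements.

def model_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone (ignores KV cache, activations)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Llama 70B at Q4 (llama.cpp's Q4_K_M is roughly ~4.5 effective bits/weight):
q4_70b = model_weight_gb(70, 4.5)    # ≈ 39 GB
fp16_70b = model_weight_gb(70, 16)   # ≈ 140 GB

print(f"70B @ Q4_K_M : ~{q4_70b:.0f} GB -> fits in 128 GB unified memory, not in a 24 GB GPU")
print(f"70B @ FP16   : ~{fp16_70b:.0f} GB -> out of reach even for a 128 GB machine")
```

So a quantized 70B model fits in a single high-memory Mac's unified memory, while on the NVIDIA side the same weights need to be split across multiple consumer cards.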
My understanding of an "AI PC" is simple: a personal computer that can run large AI models on its own hardware, with no cloud connection required, and that can go a step further by integrating AI into the system to boost productivity. The key ingredient is still compute. A Mac's raw compute can't match NVIDIA's GPUs, but it wins on unified memory (RAM is cheaper than VRAM) and, in some scenarios, on usable bandwidth: a 70B model needs multiple NVIDIA cards, while a single Mac can hold it, and the overall performance gap is not that large:
from https://www.hardware-corner.net/guides/mac-for-large-language-models/
Besides the Mac, let's look at the even beefier AI PCs that came out this year.
Project DIGITS
A Mac killer? Project DIGITS is a lot like a Mac: it has a big pool of unified memory, and with NVIDIA behind it the compute certainly won't be the weak point.
The scatter plot above illustrates the relationship between bandwidth (GB/s), token generation (TG, in tokens per second), and the number of GPU cores for different Apple silicon chips.
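The bandwidth-vs-token-generation correlation in that plot has a simple explanation: single-stream decoding is memory-bandwidth-bound, because every generated token has to stream (roughly) all of the weights from memory once. A crude upper-bound estimate, using published spec-sheet bandwidth numbers and an assumed ~40 GB Q4 70B model:

```python
# Rough upper bound on decode speed for a memory-bandwidth-bound LLM:
# each new token reads approximately all weights from memory once.
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

q4_70b_gb = 40  # approximate size of Llama 70B at Q4 quantization

# Spec-sheet memory bandwidths (illustrative, real-world throughput is lower):
for name, bw in [("Apple M2 Ultra (800 GB/s)", 800),
                 ("Apple M4 Pro (273 GB/s)", 273),
                 ("Ryzen AI Max+ 395, LPDDR5X-8000 (~256 GB/s)", 256)]:
    print(f"{name}: <= {max_tokens_per_sec(bw, q4_70b_gb):.1f} tok/s on a 70B Q4 model")
```

This is why more GPU cores alone don't move the TG numbers much: for single-user generation, the memory bus is the bottleneck long before the compute units are.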
ROG Flow Z13 can be configured with the brand new AMD Ryzen AI Max+ 395 and Radeon 8060S Graphics from AMD. This processor can deliver 50 NPU TOPS (trillion operations per second) performance. It is a certified Copilot PC that offers built-in AI features and tools. It packs up to 128GB of LPDDR5X 8000MHz RAM and up to 1TB of storage. It can allocate up to 96GB of available RAM for the GPU. It is claimed to be capable of running a 70B large language model locally.
And it is claimed to run a 70B Llama faster than a single RTX 4090. Let's look at how that comparison was actually set up:
Testing as of Dec 2024 using Llama 3.1 Nemotron 70B (Q4_K_M quantization) running through llama.cpp and LM Studio. Input prompt length: 100 tokens.
System configuration for Ryzen AI Max+ 395: AMD reference board, 55W TDP, Radeon 8060S graphics, 128GB RAM (32GB for the CPU, 96GB allocated to the GPU), 1TB SSD, using Llama 3.1.
Configuration for Nvidia RTX 4090: ASUS ProArt X670E-CREATOR WIFI motherboard, AMD Ryzen 9 7900X processor, 32GB system RAM, 24GB GPU memory, 1TB SSD, Windows 11.
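This setup explains why the single 4090 can lose despite far more raw compute: a ~40 GB Q4 70B model does not fit in 24 GB of VRAM, so part of the layers get offloaded to system RAM, and decode speed collapses toward the much slower system-memory bandwidth. A hypothetical sketch of that effect (all bandwidth numbers are illustrative assumptions, not measurements):

```python
# Sketch of layer offloading: time per token is the sum of the GPU-resident
# part and the CPU/system-RAM part, each treated as bandwidth-bound.
def decode_tok_s(model_gb: float, vram_gb: float,
                 gpu_bw: float, cpu_bw: float) -> float:
    on_gpu = min(model_gb, vram_gb)   # portion of weights held in VRAM
    on_cpu = model_gb - on_gpu        # portion spilled to system RAM
    seconds_per_token = on_gpu / gpu_bw + on_cpu / cpu_bw
    return 1 / seconds_per_token

# Assumed numbers: 4090 GDDR6X ~1000 GB/s, dual-channel DDR5 ~80 GB/s,
# and the 395's unified LPDDR5X (~256 GB/s) holding the whole 40 GB model.
print(f"RTX 4090 + CPU offload: {decode_tok_s(40, 24, 1000, 80):.1f} tok/s")
print(f"Ryzen AI Max+ 395     : {decode_tok_s(40, 96, 256, 80):.1f} tok/s")
```

Under these assumptions the spilled-to-RAM portion dominates the 4090's per-token time, which is exactly the scenario the AMD comparison was built around; a model that fits entirely in VRAM would tell a very different story.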