Aardvark Daily
The world's longest-running online daily news and commentary publication, now in its 30th year. The opinion pieces presented here are not purported to be fact but reasonable effort is made to ensure accuracy.
Content copyright © 1995 - 2025 to Bruce Simpson (aka Aardvark). The logo was kindly created for Aardvark Daily by the folks at aardvark.co.uk
It's official, I am turning to the dark side.
Well, maybe not that dark, but I am about to set up my own locally hosted LLM so as to get a better appreciation of the strengths, weaknesses, power and limits of artificial intelligence.
Yes, I could use the free versions of cloud-based LLMs such as ChatGPT or Gemini, but anyone doing so needs to be aware of the privacy implications, so I figured having my own air-gapped AI system would be a much better idea.
The big question, however, is how do I build this thing?
This may surprise you but in the era of hypercomputers that require entire datacentres to support them, I'm going to kick off by using the Raspberry Pi 5 that sits quietly under my desk.
On doing some research, I was kind of gobsmacked to learn that the humble Pi has sufficient grunt to run the smaller LLMs that are freely available for download.
Google's Gemma4 is a great candidate for this and I've already found a number of videos on YouTube with step-by-step instructions for the initial configuration.
The beauty of the Pi is that I can simply configure several SD cards with the different models I want to try and then plug-and-play as required. Each system will be totally independent of the other so when (I mean "if") I really screw something up on one, the others will remain unaffected.
The Pi is a low-cost, low-energy starting point but I'll also be moving towards using one of my older 3rd- or 4th-gen Intel i5 boxes with an old 6GB GTX 1060 as the next logical step in the hardware progression.
I'd love to take things further and build a dedicated AI server based on a couple of RTX 3090s (giving me a whopping 48GB of VRAM) but funding doesn't allow that right now.
Apparently the biggest factor in how large a model you can run, and how well it performs, is the amount of VRAM your GPUs have, and the best price-performance point right now appears to be the dual 3090 setup. These cards can be had for around NZ$1800 each -- versus $7K+ for a single RTX 5090, which may be faster but has less VRAM than the 3090 pair.
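For anyone who wants to check the sums, here's a back-of-envelope sketch of that price-per-gigabyte comparison. The prices are the rough figures quoted above, with the "$7K+" taken as a flat 7000 in the same currency (an assumption). The VRAM sizes are the published 24GB per 3090 and 32GB for the 5090, and the function name is simply made up for illustration.

```python
def cost_per_gb(price_each, vram_gb_each, count=1):
    """Return (total price, total VRAM in GB, price per GB of VRAM)."""
    total_price = price_each * count
    total_vram = vram_gb_each * count
    return total_price, total_vram, total_price / total_vram


# Rough prices from the column; the 5090 figure is a floor ("$7K+"), assumed to be 7000.
dual_3090 = cost_per_gb(1800, 24, count=2)
single_5090 = cost_per_gb(7000, 32)

for name, (price, vram, per_gb) in [
    ("2 x RTX 3090", dual_3090),
    ("1 x RTX 5090", single_5090),
]:
    print(f"{name}: ${price} for {vram}GB of VRAM = ${per_gb:.0f} per GB")
```

On those numbers the 3090 pair works out at about $75 per gigabyte of VRAM against roughly $219 for the single 5090, which is the whole argument in a nutshell. It says nothing about speed, power draw or the hassle of running two cards.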
Apart from experimentation, what will I use a local LLM for?
Well, based on what I've seen online, they are surprisingly competent at some of the more elementary tasks you might use AI for. They can extract the core meaning from large volumes of otherwise verbose text, they can help draft reports, proposals and other documents, and they can even cut modest amounts of code in languages such as Python.
It's hard to believe just how much they've squeezed into the 4-bit quantized weights of these smaller LLMs, which can be as little as a 4GB download.
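The rule of thumb behind that as-little-as-4GB figure is simple enough: the download is roughly the number of parameters times the bits per weight, divided by eight. The sketch below is only that arithmetic. The parameter counts are illustrative assumptions, not the specs of any particular model, and real files tend to run a bit larger because of metadata and because some parts of a model are usually kept at higher precision.

```python
def approx_size_gb(params_billions, bits_per_weight):
    """Approximate weight storage in GB (taking 1 GB as 10**9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Illustrative model sizes only (4 and 8 billion parameters are assumptions).
for params in (4, 8):
    for bits in (16, 4):
        size = approx_size_gb(params, bits)
        print(f"{params}B parameters at {bits}-bit: about {size:.0f}GB")
```

So a model with 8 billion parameters that would need around 16GB at 16 bits per weight shrinks to around 4GB at 4 bits, which is what brings it within reach of modest hardware.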
There are interesting parallels in place right now -- between the early days of 8-bit microcomputers and these early days of AI. Everyone's playing around with stuff that, just a few years ago, was hard to imagine. There is much to learn and a lot of fun to be had doing so, I suspect.
I'll keep readers up to date with my successes and failures as well as any lessons learned -- if you're interested.
Carpe Diem folks!
Beware The Alternative Energy Scammers
The Great "Run Your Car On Water" Scam