Downes.ca ~ Stephen's Web ~ From bare metal to a 70B model: infrastructure set-up and scripts

Stephen Downes

Knowledge, Learning, Community

I really do love 'do it yourself' posts and have an entire YouTube playlist devoted to my adventures with them. This set of instructions, however, begins with the requirement for "one cluster that had 4,088 H100 GPUs spread across 511 computers, with eight GPUs to a computer." Um, that's a bit much for my home office, let alone my budget. Then you have to make sure every computer works (they don't always), set up the software, train a single node, "burn InfiniBand burn," make sure again that the machines are working, and more. I like the 'reflections and learnings' at the end and look forward to the day when home hobbyists can set up their own 70B models in their living rooms.



Stephen Downes, Casselman, Canada
stephen@downes.ca

Copyright 2024
Last Updated: Nov 21, 2024 11:28 a.m.

Creative Commons License.
