Can a Local LLM REALLY be your daily coder? Framework Desktop with GLM 4.5 Air and Qwen 3 Coder
With the arrival of my new Framework Desktop, I decided to move to coding with only local LLMs, without touching Claude, GPT-5, or similar cloud models. I learned a lot while running GLM 4.5 Air, Qwen 3 Coder, and GPT OSS 120B, and I think I ultimately landed in a good spot.

Links:
- My recommended AI Engineer course is Scrimba: https://scrimba.com/the-ai-engineer-path-c02v?via=GosuCoder

My Links
- Subscribe: https://www.youtube.com/@GosuCoder
- Twitter/X: https://x.com/GosuCoder
- LinkedIn: https://www.linkedin.com/in/adamwilliamlarson/
- Discord: https://discord.gg/YGS4AJ2MxA

My computer specs
- GPU: RTX 5090 (sometimes an AMD 7900 XTX)
- CPU: 7800X3D
- RAM: DDR5 6000 MHz

Media/Sponsorship Inquiries: gosucoderyt@gmail.com
Video Chapters
- 0:00 Kickstarting the Local LLM Revolution: Say Goodbye to Cloud Costs!
- 0:30 Unveiling the Powerhouse: The Framework Desktop's AI Muscle
- 1:45 Putting Models to the Test: Real-World Benchmarks Revealed
- 3:20 Optimizing for Peak Performance: Essential Settings for Speed & Memory
- 6:30 The Early Struggles: Battling Repetitive Code Generation
- 7:55 Facing the Frustration: Long Prompts & Persistent Timeouts
- 9:45 A Game-Changer Arrives: Solving Timeouts with Crush
- 11:25 Exploring New Horizons: Jan AI's Creative UI Insights
- 15:15 The Ultimate Strategy: Smart Models & Swift Agents Working Together
- 16:00 Finding the Perfect Balance: Matching Model Speed to Task Size
Unprocessed Timestamp Content
- 0:00 Starting the local LLM coding journey; no more cloud dollars.
- 0:30 Meet the Framework Desktop: 96GB VRAM for serious AI tasks.
- 1:45 Benchmarking local models: Qwen 3 Coder and others on display.
- 3:20 Crucial settings for speed and memory: batch size and quantization types.
- 5:30 AMD runtime challenges: ROCm versus Vulkan, a memory adventure.
- 6:30 Initial coding pain: GPT OSS 120B's repetitive code generation blues.
- 7:55 Prompt processing struggles: long waits and frustrating timeouts.
- 9:45 Discovering Crush: the timeout-free oasis for persistent coding tasks.
- 11:25 Back to basics with Jan AI: getting smarter UI design ideas.
- 13:59 Stress-testing LLM models: a Python script reveals performance insights.
- 14:15 Framework Desktop in action: GPU working hard, yet surprisingly quiet.
- 15:15 The grand theory: combining slow, smart models with faster worker agents.
- 16:00 The sweet spot: smart, slower models for big tasks, fast for small.

Timestamps by StampBot 🤖