All Posts

Published on
March 9, 2026
ONNX Export: 3-5x Faster AI Inference
onnx inference model-optimization deployment performance germany
Export PyTorch models to ONNX: 3-5x faster inference, 60% lower GPU costs. BERT responds in 12 ms instead of 45 ms.
Published on
March 9, 2026
AI Weld Seam Tracking: 94% Less Rework
weld-seam tracking robotics manufacturing sme germany
AI seam tracking uses a camera to correct the welding robot in real time. 94% less rework, ISO 5817 quality level B. Retrofit from €12,000.
Published on
March 9, 2026
Install Ollama on Ubuntu: Local LLM in 15 Minutes
ollama ubuntu self-hosted llm installation germany
Install Ollama on Ubuntu: a local LLM in 15 minutes. Llama 3.1 on your own server, €0 API costs, full GDPR control.
Published on
March 9, 2026
Ollama Cluster: Load Balancing for 200+ Users
ollama cluster load-balancing self-hosted sme germany
Ollama cluster with load balancing: 200+ users, automatic failover, horizontal scaling. Nginx setup for small and mid-sized businesses.
Published on
March 9, 2026
Ollama GPU CUDA Setup: Ubuntu Server Guide
ollama gpu cuda ubuntu self-hosted sme germany
Ollama with an NVIDIA GPU and CUDA on Ubuntu: 8x faster than CPU. A guide to CUDA drivers, VRAM optimization, and production use.

ONNX Export: 3-5x Faster AI Inference