MLC
  • Microserving LLM engines
    Jan 7, 2025
  • Achieving Efficient, Flexible, and Portable Structured Generation with XGrammar
    Nov 22, 2024
  • Optimizing and Characterizing High-Throughput Low-Latency LLM Inference in MLCEngine
    Oct 10, 2024
  • WebLLM: A High-Performance In-Browser LLM Inference Engine
    Jun 13, 2024
  • MLC-LLM: Universal LLM Deployment Engine with ML Compilation
    Jun 7, 2024
  • GPU-Accelerated LLM on a $100 Orange Pi
    Apr 20, 2024
  • Scalable Language Model Inference on Multiple NVIDIA and AMD GPUs
    Oct 19, 2023
  • Making AMD GPUs competitive for LLM inference
    Aug 9, 2023
  • Bringing Open Large Language Models to Consumer Devices
    May 22, 2023
  • Bringing Hardware Accelerated Language Models to Android Devices
    May 8, 2023
  • Bringing Hardware Accelerated Language Models to Consumer Devices
    May 1, 2023