APFEL & Apple Local Inference

Overview

APFEL (Apple Foundation Ecosystem Layer) is an open-source CLI tool and local server that bridges to the built-in Large Language Models (LLMs) powering Apple Intelligence on macOS. Developed by Arthur-Ficial, it lets developers and power users interact with Apple’s system-level models from the command line or through an OpenAI-compatible API, without managing model weights or a separate local runtime such as Ollama.

The “Apple Foundations” Framework

At its core, APFEL leverages the FoundationModels framework, a native Swift API that ships with the macOS 26 (Tahoe) SDK. It is often misattributed to macOS 15.1 (Sequoia), the release in which Apple Intelligence itself debuted. The framework is the developer-facing entry point to Apple Intelligence and provides direct access to the Apple Foundation Model (AFM).
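
A minimal sketch of calling the framework directly from Swift. It assumes a toolchain whose SDK includes FoundationModels (the calls are compiled out otherwise) and an OS new enough to pass the availability checks, taken here to be version 26. `describeAvailability` and `generate` are illustrative names and are not taken from APFEL's source.

```swift
import Foundation
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Illustrative helper: reports whether the on-device system model is usable.
func describeAvailability() -> String {
    #if canImport(FoundationModels)
    if #available(macOS 26.0, iOS 26.0, *) {
        switch SystemLanguageModel.default.availability {
        case .available:
            return "available"
        case .unavailable(let reason):
            // e.g. device not eligible, Apple Intelligence off, model still downloading
            return "unavailable: \(reason)"
        @unknown default:
            return "unavailable: unknown state"
        }
    }
    return "unavailable: OS too old"
    #else
    return "unavailable: FoundationModels is not in this SDK"
    #endif
}

#if canImport(FoundationModels)
/// Illustrative helper: one-shot text generation with a fresh session.
@available(macOS 26.0, iOS 26.0, *)
func generate(_ prompt: String) async throws -> String {
    let session = LanguageModelSession(instructions: "Answer concisely.")
    let response = try await session.respond(to: prompt)
    return response.content
}
#endif

print(describeAvailability())
```

Checking availability first matters because the model can be missing for reasons unrelated to the calling code, such as unsupported hardware or Apple Intelligence being switched off.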

Key Technical Specs

  • Model (AFM): Approximately 3 billion parameters (Dense Transformer).
  • Quantization: Proprietary 2-bit weight quantization optimized for the Apple Neural Engine (ANE).
  • Context Window: 4,096 tokens (Hardcoded architectural limit).
  • Primary Class: FoundationModels.SystemLanguageModel.
  • Restriction: Signed-Binaries only. The framework does not currently allow loading custom weights or third-party models.
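
As a rough worked example of what the parameter count and quantization figures above imply for weight storage, the sketch below multiplies the two. It is back-of-the-envelope arithmetic under the assumption that every weight is stored at the quoted bit width; it ignores embeddings, the KV cache, and any layers kept at higher precision. `weightBytes` is an illustrative helper.

```swift
import Foundation

/// Approximate bytes needed for the weights alone: parameters * bits / 8.
func weightBytes(parameters: Double, bitsPerWeight: Double) -> Double {
    parameters * bitsPerWeight / 8.0
}

let afmBytes = weightBytes(parameters: 3e9, bitsPerWeight: 2)    // ~3B params at 2 bits
let fp16Bytes = weightBytes(parameters: 3e9, bitsPerWeight: 16)  // same model at 16 bits

print(String(format: "2-bit: %.2f GB, 16-bit: %.2f GB", afmBytes / 1e9, fp16Bytes / 1e9))
// prints "2-bit: 0.75 GB, 16-bit: 6.00 GB"
```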

APFEL Architecture & Features

APFEL is structured to be both a tool and a library, isolating the core logic into a reusable Swift Package.

1. ApfelCore

The underlying library that wraps the FoundationModels framework. It handles:

  • Inference: Using SystemLanguageModel for text generation and transformation.
  • Transcript API: Managing conversation history within the context window.
  • Native Tokenization: Counts tokens with the system model's own tokenizer. From Sources/ApfelCore/TokenCounter.swift:
    let model = try await SystemLanguageModel.load()
    return try await model.tokenCount(for: text)
  • Schema Conversion: Translating JSON tool schemas into native ToolDefinition objects for function calling.
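
Keeping conversation history inside a fixed context window can be sketched as a trimming pass over the transcript. The code below is not ApfelCore's implementation; `Message`, `trimToFit`, and the character-based `roughCount` stand-in are hypothetical names, and a real caller would pass the native token counter instead.

```swift
import Foundation

struct Message {
    let role: String   // "system", "user", or "assistant"
    let text: String
}

/// Keeps a leading system message (if present) plus the newest messages
/// whose combined token cost fits within `budget`.
func trimToFit(_ messages: [Message],
               budget: Int,
               countTokens: (String) -> Int) -> [Message] {
    var remaining = budget
    var kept: [Message] = []
    var rest = messages

    // Always retain the system prompt first, since it frames the whole session.
    if let first = rest.first, first.role == "system" {
        remaining -= countTokens(first.text)
        kept.append(first)
        rest.removeFirst()
    }

    // Walk backwards from the newest message until the budget runs out.
    var tail: [Message] = []
    for message in rest.reversed() {
        let cost = countTokens(message.text)
        if cost > remaining { break }
        remaining -= cost
        tail.append(message)
    }
    return kept + Array(tail.reversed())
}

// Stand-in counter (assumption): roughly four characters per token.
let roughCount: (String) -> Int = { max(1, $0.count / 4) }

let history = [
    Message(role: "system", text: "You are a terse assistant."),
    Message(role: "user", text: "What does APFEL do?"),
    Message(role: "assistant", text: "It exposes the on-device model via CLI and HTTP."),
    Message(role: "user", text: "How large is the context window?"),
]
// Reserve part of the 4,096-token window for the model's reply.
let window = trimToFit(history, budget: 4096 - 512, countTokens: roughCount)
print(window.count)
```

Dropping the oldest turns first is only one policy; summarizing old turns is a common alternative when earlier context still matters.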

2. Delivery Mechanisms

  • CLI: Supports piping, file attachments, and an interactive REPL mode.
  • OpenAI-Compatible Server: Runs a local backend (defaulting to port 11434).
    • PCC Safety: APFEL typically requests routing: .preferOnDevice to avoid triggering Private Cloud Compute (PCC), ensuring 100% local inference.
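
Because the server speaks the OpenAI wire format, a client only needs to build a standard chat-completions request against the local port. The sketch below constructs such a request without sending it. The `/v1/chat/completions` path follows the OpenAI convention and is assumed rather than confirmed for APFEL, and the model name `apple-foundation-model` is a placeholder.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

struct ChatMessage: Codable {
    let role: String
    let content: String
}

struct ChatRequest: Codable {
    let model: String
    let messages: [ChatMessage]
}

/// Builds a POST request in the OpenAI chat-completions shape.
/// Path and default model name are assumptions, not taken from APFEL's docs.
func makeChatRequest(prompt: String,
                     baseURL: URL = URL(string: "http://localhost:11434")!,
                     model: String = "apple-foundation-model") throws -> URLRequest {
    var request = URLRequest(url: baseURL.appendingPathComponent("v1/chat/completions"))
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body = ChatRequest(model: model,
                           messages: [ChatMessage(role: "user", content: prompt)])
    request.httpBody = try JSONEncoder().encode(body)
    return request
}

let request = try makeChatRequest(prompt: "Summarize this file in one sentence.")
print(request.url!.absoluteString)
```

The request can then be sent with `URLSession`, or the same JSON body can be posted from any existing OpenAI client library by pointing its base URL at the local server.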

Comparison: Apple Foundations (APFEL) vs. MLX

STRESS TEST RESULTS (M3 Max):

Feature            Apple Foundations (APFEL)   MLX (Llama-3-3B)
Inference Engine   Neural Engine (ANE)         GPU (Metal)
TTFT               ~18 ms                      ~45 ms
Throughput         42-50 TPS                   35-40 TPS
Power Efficiency   High (Cool/Silent)          Moderate (Fan Spin)
Flexibility        System-only                 Any Open Source
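
To see how the TTFT and throughput rows combine, a simple model is total time = TTFT + tokens / TPS. The sketch below applies it to a 200-token reply using the midpoints of the throughput ranges in the table (46 and 37.5 TPS); the reply length and the choice of midpoints are assumptions made for illustration, and `estimatedSeconds` is an illustrative helper.

```swift
import Foundation

/// Estimated wall-clock seconds to stream `tokens` output tokens:
/// time to first token plus steady-state decode time.
func estimatedSeconds(tokens: Int, ttftMs: Double, tokensPerSecond: Double) -> Double {
    ttftMs / 1000.0 + Double(tokens) / tokensPerSecond
}

let apfel = estimatedSeconds(tokens: 200, ttftMs: 18, tokensPerSecond: 46)
let mlx = estimatedSeconds(tokens: 200, ttftMs: 45, tokensPerSecond: 37.5)

print(String(format: "APFEL: %.2f s, MLX: %.2f s", apfel, mlx))
// prints "APFEL: 4.37 s, MLX: 5.38 s"
```

For replies of this length the throughput difference dominates; the TTFT gap matters mostly for very short, interactive completions.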

Gardener’s Summary

APFEL is a pivotal tool for the “local-first” AI paradigm on macOS. While it lacks the flexibility of MLX for custom model research, its latency and battery efficiency make it the superior choice for production agents and “always-on” background tasks. It serves as the bridge between Apple’s high-efficiency hardware and the open-source developer ecosystem.

Sources

  • [[Learning Path - ML Development]]
  • [[Next-Gen AI Memory Architectures]]
  • [[Retrieval-Augmented Generation (RAG)]]