TL;DR: Apple dusted off an old AI trick—Normalizing Flows—and spiced it up with Transformers to create two new image generators: TarFlow and STARFlow. Unlike diffusion or token-based autoregressive models, flows learn a reversible “noise ↔ image” mapping that gives exact likelihoods. TarFlow chops images into patches and predicts pixel values in sequence (no token compression!), while STARFlow works in a smaller latent space before upsampling to high-res, and even plugs in lightweight language models for text prompts.
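To make the "reversible mapping with exact likelihoods" idea concrete, here is a toy autoregressive affine flow in NumPy, in the spirit of TarFlow's patch-by-patch prediction. Everything here (the linear parameterization, the names `forward`/`inverse`, the tiny dimension `D`) is an illustrative assumption, not Apple's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4  # pretend each "image" is a sequence of D patch values

# Strictly lower-triangular weights: patch i conditions only on patches < i,
# which is what makes the Jacobian triangular and the likelihood exact.
W_mu = np.tril(rng.normal(size=(D, D)) * 0.1, k=-1)
W_s = np.tril(rng.normal(size=(D, D)) * 0.1, k=-1)

def forward(x):
    """Image -> noise. Returns z and the exact log-determinant term."""
    mu = W_mu @ x          # mean for patch i from earlier patches
    log_s = W_s @ x        # log-scale for patch i from earlier patches
    z = (x - mu) * np.exp(-log_s)
    return z, -log_s.sum()

def inverse(z):
    """Noise -> image, generated one patch at a time (like sampling)."""
    x = np.zeros(D)
    for i in range(D):
        mu_i = W_mu[i] @ x       # depends only on x[:i] via the mask
        log_s_i = W_s[i] @ x
        x[i] = z[i] * np.exp(log_s_i) + mu_i
    return x

x = rng.normal(size=D)
z, logdet = forward(x)
x_rec = inverse(z)  # round-trips back to x: the map is reversible

# Exact log-likelihood under a standard-normal base distribution:
log_p = -0.5 * (z @ z + D * np.log(2 * np.pi)) + logdet
```

The triangular masking is the whole trick: it makes the Jacobian's determinant a simple product of the per-patch scales, so the model can report an exact probability for any image instead of the bounds or scores that diffusion models give.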
The big sell? These flow-based models could run on your device, offering crisp detail and probability-aware outputs without constant cloud crunching. It’s a different path than OpenAI’s GPT-4o, which treats images like giant token streams in the cloud—flexible but heavy and potentially slower—whereas Apple’s approach is built for speed and efficiency in our pockets.