AI MODEL RUNS ON 12.5% OF EXPERTS WITH MINIMAL LOSS

AI DESK

SAT, MAY 16, 2026

■ AI-SUMMARIZED FROM 1 SOURCE

Researchers at the Allen Institute for AI and UC Berkeley have developed EMO, a mixture-of-experts model that maintains near-full performance while using just one-eighth of its experts. The breakthrough could make advanced AI systems practical for memory-constrained devices.

A new approach to training mixture-of-experts (MoE) models shows that AI systems can retain near-full performance while running on a fraction of their experts. In conventional MoE models, experts tend to specialize by word type or other linguistic features. The new EMO model instead groups experts by content domain, which let the researchers remove 75 percent of the experts, keeping one quarter, while sacrificing only about one percentage point of performance.

This efficiency gain addresses a critical limitation of current MoE models: their memory demands make deployment difficult in resource-constrained environments. By sharply reducing the number of experts that must be loaded, EMO opens the possibility of running these models on devices with limited RAM and storage.

The domain-based specialization appears to create cleaner separation between expert functions than traditional approaches. This structure allows for more aggressive pruning without cascading performance degradation: researchers can identify and remove experts that handle less common or overlapping content areas.

The one-percentage-point drop is a practical trade-off. For many applications, particularly those that do not require maximum accuracy, the efficiency gains could outweigh this modest cost.

The implications extend beyond smaller devices. Faster inference from a reduced expert set could cut operational costs for large-scale AI services, and lower memory requirements could expand edge deployment to scenarios where full models are currently impractical.

Mixture-of-experts has emerged as a key scaling strategy for large language models, with major implementations from Meta, Google, and others. However, the scaling benefits come with memory and latency penalties that have limited real-world adoption, so methods that make MoE models more efficient address a genuine infrastructure challenge.

The research demonstrates that architectural choices shape how AI models can be optimized. Organizing computation by content domain rather than linguistic patterns produces systems that compress more effectively, an insight that could influence how future large models structure their expert components. Further work will likely explore how this approach scales to larger models and whether similar domain-based specialization benefits other AI architectures.
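
To make the pruning idea concrete, here is a minimal Python sketch of one way a domain-specific expert subset could be chosen and what it might save in memory. It is an illustration only, not the EMO researchers' method: the function names, the routing trace, the expert count, and the parameter sizes are all hypothetical, and the selection rule (keep the most frequently routed experts) is a simple stand-in for whatever criterion the actual work uses.

```python
from collections import Counter


def select_experts(routing_log, keep_fraction, num_experts):
    """Return the ids of the experts used most often in a routing trace.

    routing_log   -- expert ids chosen by the router for one content domain
                     (hypothetical data, one id per routed token)
    keep_fraction -- share of experts to keep, e.g. 0.25 or 0.125
    num_experts   -- total number of experts in the layer
    """
    keep = max(1, int(num_experts * keep_fraction))
    counts = Counter(routing_log)
    # Rank all experts by usage; ties go to the lower id so the
    # result is deterministic. Unused experts count as zero.
    ranked = sorted(range(num_experts), key=lambda e: (-counts[e], e))
    return sorted(ranked[:keep])


def expert_memory_gb(num_experts, params_per_expert, bytes_per_param=2):
    """Rough memory for expert weights alone, assuming 16-bit parameters."""
    return num_experts * params_per_expert * bytes_per_param / 1e9


# Hypothetical trace: the router mostly picks experts 3 and 0 for this domain.
trace = [0, 3, 3, 0, 3, 5, 0, 3, 1, 3]
print(select_experts(trace, keep_fraction=0.25, num_experts=8))  # [0, 3]

# Hypothetical sizes: 128 experts of 50 million parameters each.
full = expert_memory_gb(128, 50_000_000)
quarter = expert_memory_gb(32, 50_000_000)
eighth = expert_memory_gb(16, 50_000_000)
print(f"{full:.1f} GB, {quarter:.1f} GB, {eighth:.1f} GB")  # 12.8 GB, 3.2 GB, 1.6 GB
```

The arithmetic covers expert weights only. Shared components such as attention layers and embeddings are not pruned in this sketch, so the saving for a whole model would be smaller than the expert fraction alone suggests.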

■ SOURCES

► The Decoder

■ SUMMARY WRITTEN BY AI FROM THE LINKS ABOVE
