enni.

எண்ணி · to think

We build software that has to think.

Enni Technologies is an AI-native company. We build and research applied intelligence — LLM systems, agents, and on-device AI — designed to hold up in production, not just demo well.

Start a conversation →

See what we build

Focus

Applied AI

Systems

LLMs · Agents · On-device

Method

Eval-driven

What we build

Intelligence, applied — and held to a standard you can measure.

We take a small number of hard problems and go deep, building systems that stay reliable once real users and real data arrive.

Applied LLM systems

Retrieval, knowledge assistants, and structured generation that stay grounded in your data — measured, bounded, and cost-aware rather than impressive once and unreliable after.

RAG & retrieval architecture
Grounded assistants & copilots
Structured, schema-safe generation

RAG · Vector search · Grounding · Structured output · Cost governance

Agents & autonomous workflows

Agentic systems that take real actions inside real constraints — bounded loops, tool use, and evaluation built in from the first line, so autonomy never means unpredictability.

Tool-using agents & orchestration
Evals, guardrails & observability
Human-in-the-loop where it matters

Agents · Tool use · Evals · Guardrails · Orchestration

On-device & private AI

Models that run where the data lives — fast, private, and offline-capable. For products where latency, cost, or confidentiality rule out a round trip to someone else's cloud.

Local & edge inference
Quantization & model selection
Privacy-preserving architecture

On-device · Edge inference · Quantization · Privacy

AI strategy & architecture

The foundational decisions, made well — model strategy, evaluation design, and system architecture so a serious AI build rests on the right footing from the start.

AI system design · Eval strategy · Model selection · Architecture

Research

We don't just apply intelligence — we investigate it.

Research isn't a separate department here. The open questions we work on feed directly into what we build, and what we ship sends us back to the questions.

Efficient inference

Running capable models inside tight latency, cost, and privacy budgets — on device and at the edge.

Retrieval & grounding

Keeping generated output anchored to verifiable sources, and knowing when a system should say "I don't know."

Agent reliability

Making autonomous systems measurable and predictable — evaluation, failure modes, and bounded behavior.

Blog

Notes from the work.

2026 · Jun · On-device AI

AFM 3 Core Advanced: a 20-billion-parameter model that runs on a phone

Apple's new on-device flagship keeps 20B parameters resident but activates only a few billion per prompt. Why it's pinned to the A19 Pro — and what it means for shipping real AI to the devices people already own.

Read →

Soon · Research

Reproducing Apple's KV-cache sharing at 30M parameters

A from-scratch PyTorch reproduction of the technique from Apple's foundation-model paper — and the 37.5% KV-cache memory reduction that matched it exactly.

In progress
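For a sense of where a figure like 37.5% can come from, here is a minimal arithmetic sketch. It assumes that every layer's KV cache is the same size and that some layers reuse an earlier layer's keys and values instead of storing their own. The 8-layer, 3-sharing split and the function name `kv_cache_saving` are illustrative assumptions, not details taken from the post.

```python
def kv_cache_saving(total_layers: int, sharing_layers: int) -> float:
    """Fraction of KV-cache memory saved when `sharing_layers` layers reuse
    another layer's cache rather than keeping their own.

    Assumes every layer's cache has the same size (hypothetical setup).
    """
    if not 0 <= sharing_layers < total_layers:
        raise ValueError("sharing_layers must be in [0, total_layers)")
    return sharing_layers / total_layers


# Illustrative 5:3 split: 5 layers keep their own cache, 3 reuse one.
saving = kv_cache_saving(total_layers=8, sharing_layers=3)
print(f"{saving:.1%}")  # 37.5%
```

Under these assumptions the saving depends only on the share of layers that reuse a cache, so the same ratio holds whether the model has 30M parameters or several billion.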

The build loop

Not a pipeline — a loop. It runs the way an agent does: observe, act, evaluate, repeat.

No black box. You see every turn, and each cycle's result becomes the next cycle's input — until the criteria are met.

while ( success criteria not met ) {

01 · observe

Read the state

The problem, the data, and whatever the last cycle returned. Context before action — always.

02 · plan

Pick the next move

The smallest step that cuts the most uncertainty — which model, which test, which slice to build next.

03 · act

Build & run it

Every turn ends in something executable you can open, not a status update.

04 · evaluate

Measure, then loop

Quality, cost, latency, failure modes — against the criteria. The result isn't an endpoint; it's the next iteration's input.

} ↻ converged — ship.
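The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our actual tooling: `LoopState`, `build_loop`, `plan`, `act`, the integer score, and the `max_cycles` cap are all hypothetical names and choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopState:
    """What each cycle observes and what the next cycle inherits."""
    score: int = 0                                # best evaluation result so far
    history: list[str] = field(default_factory=list)


def build_loop(
    plan: Callable[[LoopState], str],
    act: Callable[[str], int],
    target: int,
    max_cycles: int = 10,
) -> LoopState:
    state = LoopState()
    for _ in range(max_cycles):       # bounded: a stalled loop still ends
        if state.score >= target:     # 01 observe: criteria met, converged
            break
        step = plan(state)            # 02 plan: pick the next move
        result = act(step)            # 03 act: build and run it
        state.score = max(state.score, result)  # 04 evaluate against criteria
        state.history.append(step)    # this cycle's result feeds the next
    return state


# Toy stand-ins: each step is numbered, and each run scores 30 points more.
def plan(state: LoopState) -> str:
    return f"step-{len(state.history) + 1}"


def act(step: str) -> int:
    return 30 * int(step.split("-")[1])


final = build_loop(plan, act, target=80)
print(final.history, final.score)  # ['step-1', 'step-2', 'step-3'] 90
```

The cap on cycles is a deliberate choice in this sketch: the loop exits either because the criteria are met or because the budget runs out, never by running indefinitely.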

The stack

The tools we build with.

Models

Frontier LLMs · Open-weight models · Fine-tuning · Prompt systems · Multi-model routing

Retrieval & memory

Vector search · Hybrid retrieval · Semantic caching · Knowledge graphs

Agents

Tool use · Orchestration · Bounded loops · Workflow design

Evaluation & safety

Eval harnesses · Guardrails · Observability · Cost governance

Foundation

Python · FastAPI · PostgreSQL · Redis · Cloud & edge deploy

About

Enni Technologies is an independent, AI-native company.

We build applied intelligence — systems that reason, retrieve, and act — and we research the problems underneath them, taking on few at a time so each gets full attention.

No growth for its own sake, no buzzwords standing in for results. Just AI engineering and research held to a standard you can measure.

name · origin

எண்ணி

en·ni — Tamil

"To think." It's the whole idea: software that reasons, not software that guesses.

Have something that has to think?

info@ennitechnologies.com ↗

Company

Enni Technologies

Focus

Applied AI & research