enni.
எண்ணி எண்ணி · to think

We build software that has to think.

Enni Technologies is an AI-native company. We build and research applied intelligence — LLM systems, agents, and on-device AI — designed to hold up in production, not just demo well.

Focus: Applied AI
Systems: LLMs · Agents · On-device
Method: Eval-driven

What we build

Intelligence, applied — and held to a standard you can measure.

We take a small number of hard problems and go deep, building systems that stay reliable once real users and real data arrive.

Applied LLM systems

Retrieval, knowledge assistants, and structured generation that stay grounded in your data — measured, bounded, and cost-aware rather than impressive once and unreliable after.

  • RAG & retrieval architecture
  • Grounded assistants & copilots
  • Structured, schema-safe generation
RAG · Vector search · Grounding · Structured output · Cost governance

Agents & autonomous workflows

Agentic systems that take real actions inside real constraints — bounded loops, tool use, and evaluation built in from the first line, so autonomy never means unpredictability.

  • Tool-using agents & orchestration
  • Evals, guardrails & observability
  • Human-in-the-loop where it matters
Agents · Tool use · Evals · Guardrails · Orchestration

On-device & private AI

Models that run where the data lives — fast, private, and offline-capable. For products where latency, cost, or confidentiality rule out a round trip to someone else's cloud.

  • Local & edge inference
  • Quantization & model selection
  • Privacy-preserving architecture
On-device · Edge inference · Quantization · Privacy

AI strategy & architecture

The foundational decisions, made well — model strategy, evaluation design, and system architecture so a serious AI build rests on the right footing from the start.

AI system design · Eval strategy · Model selection · Architecture

Research

We don't just apply intelligence — we investigate it.

Research isn't a separate department here. The open questions we work on feed directly into what we build, and what we ship sends us back to the questions.

Efficient inference

Running capable models inside tight latency, cost, and privacy budgets — on device and at the edge.

Retrieval & grounding

Keeping generated output anchored to verifiable sources, and knowing when a system should say "I don't know."

Agent reliability

Making autonomous systems measurable and predictable — evaluation, failure modes, and bounded behavior.

Blog

Notes from the work.

2026 · Jun · On-device AI

AFM 3 Core Advanced: a 20-billion-parameter model that runs on a phone

Apple's new on-device flagship keeps 20B parameters resident but activates only a few billion per prompt. Why it's pinned to the A19 Pro — and what it means for shipping real AI to the devices people already own.

Read →

Soon · Research

Reproducing Apple's KV-cache sharing at 30M parameters

A from-scratch PyTorch reproduction of the technique from Apple's foundation-model paper, with a 37.5% KV-cache memory reduction that matches the paper's figure exactly.

In progress
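
A figure like 37.5% can be sanity-checked with plain arithmetic. The sketch below assumes a hypothetical 8-layer model in which the last 3 layers reuse an earlier layer's keys and values instead of storing their own. The 5:3 split, the tensor shapes, and the helper name `kv_cache_bytes` are illustrative assumptions, not details taken from the post or the reproduction.

```python
def kv_cache_bytes(layers_with_cache, seq_len, kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for the layers that store their own."""
    # Factor of 2: one tensor for keys, one for values, per caching layer.
    return 2 * layers_with_cache * seq_len * kv_heads * head_dim * bytes_per_value


# Hypothetical shape, chosen only for illustration.
TOTAL_LAYERS = 8
SHARED_LAYERS = 3  # assumed: these layers reuse an earlier layer's KV cache
shape = dict(seq_len=1024, kv_heads=4, head_dim=64)

baseline = kv_cache_bytes(TOTAL_LAYERS, **shape)
shared = kv_cache_bytes(TOTAL_LAYERS - SHARED_LAYERS, **shape)
reduction = 1 - shared / baseline

print(f"KV-cache reduction: {reduction:.1%}")
```

Under these assumptions the saving depends only on the fraction of layers that stop storing their own cache (3 of 8), not on sequence length, head count, or precision.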
The build loop

Not a pipeline — a loop. It runs the way an agent does: observe, act, evaluate, repeat.

No black box. You see every turn, and each cycle's result becomes the next cycle's input — until the criteria are met.

while ( success criteria not met ) {
01 · observe

Read the state

The problem, the data, and whatever the last cycle returned. Context before action — always.

02 · plan

Pick the next move

The smallest step that cuts the most uncertainty — which model, which test, which slice to build next.

03 · act

Build & run it

Every turn ends in something executable you can open, not a status update.

04 · evaluate

Measure, then loop

Quality, cost, latency, failure modes — against the criteria. The result isn't an endpoint; it's the next iteration's input.

} ↻ converged — ship.
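
As a rough illustration, the loop above can be written as a short Python sketch. The function name `build_loop`, its parameters, the `max_cycles` bound, and the toy callables in the usage lines are all hypothetical, chosen only to show the shape of the loop; this is not Enni's actual tooling.

```python
def build_loop(plan, act, evaluate, target, max_cycles=10):
    """Run observe -> plan -> act -> evaluate until the score reaches the target.

    Returns (converged, history), where history holds one (step, score) pair
    per cycle. max_cycles keeps the loop bounded if the criteria are never met.
    """
    history = []
    for _ in range(max_cycles):
        # 01 observe: the state is everything earlier cycles returned.
        state = list(history)
        # 02 plan: pick the next step from that state.
        step = plan(state)
        # 03 act: build and run it, producing something concrete.
        artifact = act(step)
        # 04 evaluate: measure against the criteria, then feed the result forward.
        score = evaluate(artifact)
        history.append((step, score))
        if score >= target:
            return True, history
    return False, history


# Toy usage: each cycle takes one more step, and each step adds 0.25 to the score.
converged, history = build_loop(
    plan=lambda state: len(state) + 1,
    act=lambda step: step * 0.25,
    evaluate=lambda artifact: artifact,
    target=0.75,
)
print(converged, history)
```

The explicit cycle cap is a design choice in this sketch: a loop that can fail to converge should stop and report its history rather than run indefinitely.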
The stack

The tools we build with.

Models
Frontier LLMs · Open-weight models · Fine-tuning · Prompt systems · Multi-model routing
Retrieval & memory
Vector search · Hybrid retrieval · Semantic caching · Knowledge graphs
Agents
Tool use · Orchestration · Bounded loops · Workflow design
Evaluation & safety
Eval harnesses · Guardrails · Observability · Cost governance
Foundation
Python · FastAPI · PostgreSQL · Redis · Cloud & edge deploy
About

Enni Technologies is an independent, AI-native company.

We build applied intelligence (systems that reason, retrieve, and act) and we research the problems underneath them, taking on only a few at a time so each gets full attention.

No growth for its own sake, no buzzwords standing in for results. Just AI engineering and research held to a standard you can measure.

name · origin
எண்ணி
en·ni — Tamil
"To think." It's the whole idea: software that reasons, not software that guesses.

Have something that has to think?

info@ennitechnologies.com
Company: Enni Technologies
Focus: Applied AI & research