Maxy

§ Architecture

Seven layers.
One graph.

Deterministic where correctness matters. LLM where semantic judgement matters. Everything runs on a device you own, grounded in a local knowledge graph. Only prompts leave the premises.

§ 01  Positioning

Maxy is the Private Network Intelligence layer a knowledge worker can run on their own hardware. A graph-backed memory model grounds every response. A plugin architecture scopes tool access per domain. A hard split between admin and public agents enforces data boundaries by design. Everything runs on commodity hardware with zero cloud dependency.

§ 02  Stack

The stack, band by band.

Neo4j graph, on your premises.

Layer 7. Interfaces: WhatsApp, Web Chat, Telegram, Email
Layer 6. Agents: Admin Agent, Public Agents, Specialist Sub-agents
Layer 5. Execution: Projects (atomic lifecycle), Tasks (graph-backed), Workflows (deterministic)
Layer 4. Plugins & Capabilities: Tools (domain-scoped), Skills (structured guides), Hooks (enforced rules), Crons (scheduled triggers)
Layer 3. Intelligence: Maxy handles tool routing, knowledge scoping, and workflow execution; Claude handles understanding, synthesis, and generation
Layer 2. Ontological Substrate: Neo4j Knowledge Graph, Vector Embeddings, Connection strength score (forward-looking)
Layer 1. Device: Raspberry Pi / Mini PC, Linux, on-premises

Only prompts and responses cross the device boundary, to the Claude API (Anthropic).
§ 03  Layers

How the layers work together.

Layer 1. The device

A Raspberry Pi or Mini PC running Linux on your premises. This is the system boundary. Your data does not leave the device. Only LLM prompts and responses traverse external connections to Anthropic's API. Everything else runs locally: the graph, the agents, the workflows, the plugins.

Sealed local graph with public-key identity

Each Maxy node carries a key pair generated at setup. The graph is partitioned per account: every node and edge is scoped to the operator who owns it, with no shared namespace across devices. A future federated mode (forward-looking) layers a peer-to-peer mesh on the same primitives. Nodes broadcast warm-intro requests to peers; peers query their own local graphs privately and choose whether to surface an offer. No graph data is shared, only intro offers. Federation is built in: the architecture does not need rebuilding to reach the federated state. The first release ships local-only because the primitive is valuable on its own.

Layer 2. Ontological substrate

Neo4j graph database and vector embeddings form the ontological substrate. This is not a data store bolted onto an LLM. It is the ground truth the entire system reasons against. Entity relationships, business rules, customer history, document knowledge, and agent memory all live here. Vector embeddings enable semantic retrieval; graph relationships enable context traversal. Neither alone is sufficient.

The name matters. Most systems call this layer “memory” or “the database” and treat it as storage. Maxy calls it what it actually is: the ontological substrate the rest of the system depends on. At the scale and complexity of a real relationship archive, the creation and interpretation of this substrate is only possible with AI. No human can hand-model every contact, every commitment, every thread at the fidelity this requires. That is why the substrate is the foundation, not a side-effect of the application.

Connections, typed and scored

Relationships are stored as first-class objects, not patterns to be re-extracted from text on every query. Knowing someone, working at a company, introducing one person to another: each is a distinct kind of connection with its own properties. The date a connection was made on LinkedIn, the current or past flag on a job: each property sits on the connection it describes, not duplicated across the people involved.

Connection strength is a number. A composite score is computed across every connection in the graph, refreshed nightly. Seven signals feed it: how recently the two of you have been in touch, how often, whether the back-and-forth is balanced, how many channels the relationship runs across, who reaches out first, how deep the engagement runs, and whether you publicly engage with each other's work. The score makes "warmest connection at this company" a real answer rather than a guess. (Forward-looking.)
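A composite over the seven signals might look like the sketch below. The source names the signals but not their weighting, so the weights here are illustrative assumptions; the only structural claims taken from the text are that each signal contributes to one number and that the result is comparable across the whole graph.

```python
# Illustrative weights: the seven signals come from the text, the numbers do not.
SIGNALS = {
    "recency": 0.20,            # how recently you were in touch
    "frequency": 0.20,          # how often
    "reciprocity": 0.15,        # is the back-and-forth balanced
    "channel_breadth": 0.10,    # how many channels the relationship spans
    "initiation": 0.10,         # who reaches out first
    "depth": 0.15,              # how deep the engagement runs
    "public_engagement": 0.10,  # engaging with each other's public work
}

def connection_strength(signals: dict[str, float]) -> float:
    """Weighted composite over seven normalised (0..1) signals."""
    assert abs(sum(SIGNALS.values()) - 1.0) < 1e-9  # weights form a convex combination
    return sum(weight * signals.get(name, 0.0) for name, weight in SIGNALS.items())
```

Because the score is a plain number on the connection, "warmest connection at this company" reduces to a sort, not an LLM call.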

Layer 3. Intelligence

The intelligence layer is where ontological grounding meets LLM reasoning. Claude (via the Anthropic API) handles language understanding and generation. Maxy handles everything else: resolving the relevant knowledge scope from the graph, filtering available tools to the resolved domain, assembling grounded context, and routing responses. LLM reasoning is applied only where it adds genuine value: interpretation, synthesis, judgement. Retrieval, routing, and execution are deterministic.

The graph schema itself is injected into every relevant agent's system prompt. The LLM keeps its semantic judgement but cannot generate out-of-schema entities or edges. Deterministic boundaries with intelligent judgement inside them, not an either-or. This is how Maxy retains the elegance of LLM-authored writes without paying the hallucination tax.
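The enforcement side of schema-bounded judgement can be sketched as a validator that sits between the LLM's proposed write and the graph. The schema fragment and function names below are hypothetical; the point is that the model proposes entities and edges freely, and anything outside the injected schema is rejected deterministically before it touches Neo4j.

```python
# Hypothetical schema fragment; in Maxy the schema lives in the graph and is
# also injected into the agent's system prompt.
SCHEMA = {
    "nodes": {"Person", "Company", "Project"},
    "edges": {("Person", "WORKS_AT", "Company"),
              ("Person", "KNOWS", "Person")},
}

def validate_write(node_labels: dict[str, str],
                   edges: list[tuple[str, str, str]]) -> list[str]:
    """Reject any LLM-proposed entity or edge that falls outside the schema."""
    errors = []
    for node, label in node_labels.items():
        if label not in SCHEMA["nodes"]:
            errors.append(f"unknown label {label!r} on {node}")
    for src, rel, dst in edges:
        triple = (node_labels.get(src), rel, node_labels.get(dst))
        if triple not in SCHEMA["edges"]:
            errors.append(f"out-of-schema edge {triple}")
    return errors  # empty list means the write may be committed
```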

Layer 4. Plugins and capabilities

Maxy ingests from email (IMAP / OAuth), calendar (CalDAV / OAuth), LinkedIn export, WhatsApp plus Android msgstore, Signal Desktop SQLCipher plus mobile exports, Telegram JSON export, and Substack subscriber, post, and comment archives. Each is an opt-in plugin. Seven sources of relationship signal feeding the same local graph. Data never leaves the device.

Layer 5. Execution

Projects are first-class graph entities with dedicated tools that operate atomically. A single call provisions the project node, all child work items, dependency links, and stakeholder relationships in one transaction. The design is grounded in PMBOK 7th Edition and APM Body of Knowledge 5th Edition, adapted for conversational use: lifecycle phases track progression from planning through completion, server-side health signals surface blockers and overdue items as structured data, and three execution tiers (quick, standard, full) match operational complexity to the work at hand.
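The atomic lifecycle can be sketched against an in-memory graph. In Maxy the same all-or-nothing guarantee comes from a single Neo4j transaction; the dictionary-based staging below is an illustrative stand-in, and the function and relationship names are assumptions except for BLOCKS, which the next paragraph names.

```python
# A minimal sketch of all-or-nothing project provisioning. Staging then a
# single commit imitates the one-transaction behaviour described in the text.
def provision_project(graph: dict, name: str, tasks: list[str],
                      deps: list[tuple[str, str]]) -> None:
    """Create the project node, child tasks, and dependency links atomically."""
    staged_nodes = dict(graph["nodes"])
    staged_edges = list(graph["edges"])
    staged_nodes[name] = {"label": "Project"}
    for t in tasks:
        staged_nodes[t] = {"label": "Task"}
        staged_edges.append((name, "HAS_TASK", t))
    for blocker, blocked in deps:
        if blocker not in staged_nodes or blocked not in staged_nodes:
            raise ValueError(f"dangling dependency {blocker} -> {blocked}")
        staged_edges.append((blocker, "BLOCKS", blocked))
    # Commit only after every write has validated; any failure above leaves
    # the live graph untouched.
    graph["nodes"], graph["edges"] = staged_nodes, staged_edges
```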

Tasks and Workflows complete the execution layer. Dependency tracking (BLOCKS relationships) and conflict detection (AFFECTS relationships) are graph-native. Task sequencing and workflow steps follow graph-defined rules, not model discretion.
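Graph-native dependency tracking means a blocker check is a traversal, not a judgement call. A minimal sketch over BLOCKS edges (the relationship type named above; the data layout is illustrative):

```python
def open_blockers(edges: list[tuple[str, str, str]],
                  done: set[str], task: str) -> list[str]:
    """Which incomplete tasks still BLOCK this one? Pure traversal, no model call."""
    return [src for src, rel, dst in edges
            if rel == "BLOCKS" and dst == task and src not in done]
```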

Layer 6. Agents

Three agent tiers with formally separated access: Admin (owner-facing, full graph access, full tool access), Public (customer-facing, scoped to public knowledge only), and Specialist sub-agents (domain-specific execution dispatched by the admin agent). Each tier is a Role in the ontological sense: defined capabilities, defined knowledge access, defined interaction boundaries. Agents cannot escalate their own permissions.
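The tier boundaries reduce to a lookup the model cannot rewrite. The capability table below is illustrative (the tool names are assumptions); the structural claims it encodes are from the text: admin sees everything, public is scoped to public knowledge, and there is no code path by which an agent edits its own tier.

```python
# Illustrative capability table; fixed at startup, not writable by any agent.
TIERS = {
    "admin":  {"graph_scope": "all",    "tools": {"crm", "email", "db_write"}},
    "public": {"graph_scope": "public", "tools": {"faq", "booking"}},
}

def authorise(tier: str, tool: str) -> bool:
    """An agent may call only the tools its tier grants; unknown tiers get nothing."""
    return tool in TIERS.get(tier, {}).get("tools", set())
```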

The admin tier currently fields six specialists: a project manager, a personal assistant, a research assistant, a content producer, a coach, and a database operator. The database operator is the role that makes the Data to Information rung of the DIKW ladder feasible. It normalises every captured surface into nodes in the graph and keeps every edge current without a minute of manual upkeep. Without it, the other five specialists would have no substrate to reason against.

Layer 7. Interfaces

WhatsApp, Web Chat, Telegram, Email. All channels share the same underlying graph. A customer conversation on WhatsApp and a task created via the web interface exist in the same knowledge space. Channel is irrelevant; context is always whole.

§ 04  Principles

Key design principles.

Determinism where correctness matters

Tool routing, data access, workflow sequencing, and capability boundaries follow formal rules. The model cannot override them.

LLM reasoning where judgement matters

Language understanding, synthesis, and contextual response generation are handled by Claude, applied to grounded, locally-held data.

Schema-bounded LLM judgement

The graph schema is injected directly into every relevant agent's system prompt. The LLM keeps its semantic judgement but cannot generate out-of-schema entities or edges. Deterministic boundaries, intelligent judgement inside them.

The graph as ontological substrate

Every entity and relationship lives in Neo4j. Agent reasoning is grounded against this at runtime, not assembled from parametric model knowledge. This is why Maxy performs well in domain-specific contexts where general LLMs are weakest.

Scoped knowledge, not open access

Public agents see public knowledge. Admin agents see everything. Documents, conversations, and contacts are scoped at write time. No agent can retrieve data outside its defined scope.

Privacy by architecture

On-premises deployment is not a feature. It is a structural guarantee. The device boundary is the privacy boundary.

This places Maxy at L2 to L3 of the neurosymbolic coupling maturity model (Tuan, 2026): ontology-constrained tool discovery and knowledge-grounded reasoning, rather than the prompt-injection approach common in competing architectures.

§ 05  Contrast

Where the architecture is deliberately different.

Two concurrent signals validated the category in April 2026. Garry Tan, CEO of Y Combinator, open-sourced GBrain, his personal AI memory system: a Markdown brain repository with Git on top, Postgres with pgvector for retrieval, and an agent-skills layer running over OpenClaw and Hermes. Andrej Karpathy published a parallel Markdown-first pattern in a public gist the same month. Maxy is aimed at the same outcome. Each of the architectural choices below differs deliberately.

Graph-first, not Markdown-first

Data becomes information only when it is connected. Ten thousand loose Markdown files with Git on top is how a developer's personal rig grows up. It is not how a relationship archive stays coherent at scale. Maxy persists to Neo4j with vector embeddings, treating the graph itself as the source of truth and letting the operator read through conversation rather than through a file browser.

Schema-bounded LLM judgement, not zero-LLM link extraction

GBrain extracts typed relationships (attended, works_at, invested_in) with zero LLM calls on every write. That is a reasonable avoidance of hallucination in a system with a loose schema. Maxy takes the more elegant route: the graph schema is injected directly into every relevant agent's system prompt. The LLM keeps its semantic judgement but cannot generate out-of-schema entities or edges. Deterministic boundaries with intelligent judgement inside them, not an either-or.

Conversational surface, not agent-framework developer product

GBrain runs behind OpenClaw and Hermes. The operator maintains their own cron jobs and skill pack. Maxy is Claude Code wrapped for the non-technical operator and reached through chat: WhatsApp, Telegram, web. Same agent-framework paradigm underneath. Packaged surface on top.

On-device, not cloud Postgres

GBrain's default deployment lands Postgres on a developer machine or a cloud instance. Maxy ships the graph, the vector index, and the agent on a private appliance under the operator's control. The device boundary is the privacy boundary.

§ 06  Security

Security & data protection.

Inbound message screening

Every inbound message, regardless of channel, passes through a centralised screening gateway before reaching the agent. The gateway detects prompt injection patterns, classifies intent, and assigns a safety verdict. Unsafe messages on public channels are refused without agent invocation. Admin messages receive advisory screening only: flagged, never blocked.
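The gateway's verdict logic can be sketched as below. The regex patterns are illustrative stand-ins (a real gateway would combine classifiers with rules); the channel-dependent behaviour, refuse on public versus flag-only on admin, is the policy described above.

```python
import re

# Illustrative injection patterns; not an exhaustive or production list.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"reveal your system prompt", re.I),
]

def screen(message: str, channel: str) -> dict:
    """Assign a safety verdict before any agent is invoked."""
    flagged = any(p.search(message) for p in INJECTION_PATTERNS)
    if flagged and channel == "public":
        return {"verdict": "refuse", "invoke_agent": False}  # blocked outright
    if flagged:
        return {"verdict": "flag", "invoke_agent": True}     # admin: advisory only
    return {"verdict": "safe", "invoke_agent": True}
```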

Message → Inbound Gateway (screen + classify) → Agent

Agent action audit trail

Every tool invocation by the admin agent produces a durable audit record in the knowledge graph: tool name, input, output, timestamp, and originating conversation. Records persist indefinitely and are queryable in conversation. This is not a debug log. It is a first-class data structure designed for accountability and operational review.
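A sketch of the record shape and the in-conversation query. The field list mirrors the five items named above; the dataclass representation is illustrative, since in Maxy the record is a node in the knowledge graph.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class AuditRecord:
    # Fields mirror the source's list: tool, input, output, conversation, timestamp.
    tool: str
    tool_input: str
    tool_output: str
    conversation_id: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

def query_audit(records: list[AuditRecord], conversation_id: str) -> list[AuditRecord]:
    """Audit records are first-class data: queryable by conversation, never pruned."""
    return [r for r in records if r.conversation_id == conversation_id]
```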

GDPR compliance by architecture

Data subject rights are exercised through the same conversational interface used for everything else. Subject access requests (Article 15) produce a portable export of all data held on a contact. Erasure requests (Article 17) perform a full cascade deletion across all data stores with a confirmation gate and deletion receipt. On-premises deployment means the data controller is also the data processor. No third-party data sharing.
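The Article 17 path can be sketched as a gated cascade over an in-memory graph. The function shape and receipt fields are assumptions; the three behaviours it encodes come from the text: a confirmation gate before anything is deleted, a full cascade across nodes and edges, and a deletion receipt afterwards.

```python
def erase_contact(graph: dict, contact: str, confirmed: bool) -> dict:
    """Article 17 erasure: confirmation gate, full cascade, deletion receipt."""
    if not confirmed:
        return {"status": "awaiting_confirmation", "deleted": 0}  # gate holds
    removed_nodes = [n for n in graph["nodes"] if n == contact]
    graph["nodes"] = {n: v for n, v in graph["nodes"].items() if n != contact}
    before = len(graph["edges"])
    # Cascade: every edge touching the contact goes with the node.
    graph["edges"] = [(s, r, d) for s, r, d in graph["edges"]
                      if contact not in (s, d)]
    return {"status": "erased",
            "deleted": len(removed_nodes) + before - len(graph["edges"])}
```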