Documentation · v1.0.0

wawa-note

A free, open-source AI workspace that captures meeting evidence — audio, scans, links, notes — and turns it into a searchable project knowledge store with tasks, decisions, and typed relationships. Agentic AI chat navigates your knowledge like a filesystem.

Overview

wawa-note v1.0.0 is a native iOS app built with Swift 6.0 and SwiftUI. It requires iOS 17+ and runs on iPhone. The project is MIT licensed and fully open source.

The core thesis: meeting evidence becomes reusable project memory, and project memory becomes an explorable graph with tasks, decisions, owners, and connected artifacts — all with evidence provenance tracing back to source material.

There are no Wawa Note servers, so the project never receives your data. You bring your own API keys and choose your AI provider. Your knowledge is portable: export everything, import freely.

Architecture

The app follows a layered architecture with protocol-first boundaries. Every external dependency — AI providers, transcription engines, import/export formats, context sensors — sits behind a Swift protocol.

iOS App (SwiftUI)
  Capture │ Inbox │ Explore │ Chat │ Settings
  ─────────────────────────────────────────
Domain Layer
  Agent (Shell VFS + Tool Calling) │ Content Pipeline
  Project Models │ Graph │ Calendar │ Search
  ─────────────────────────────────────────
Provider Abstraction
  OpenAI │ Anthropic │ Gemini │ DeepSeek │ OpenAI-compatible │ Local LLM
  ─────────────────────────────────────────
Storage
  SwiftData (metadata) │ FileManager (artifacts)
  Keychain (API keys)  │ Spotlight (indexing)

The 4-tab navigation is Capture (record, scan, import), Inbox (global search and triage), Explore (project-first workspace browser with Timeline), and Chat (agentic AI chat with tool calling).

No backend. No SaaS. No cloud sync. No accounts.

Project structure

wawa-note/
  App/                 WawaNoteApp.swift
  Audio/               Capture, Playback, Session, FileWriter
  Connectivity/        Watch session, RecordingCoordinator
  ContextCapture/      Calendar, Location, Focus, Motion, Battery, AudioRoute
  Domain/
    Agent/             AgentLoop, ShellInterpreter, ShellTool, ToolContext
    Calendar/          CalendarEvent, CalendarSyncService, Timeline, DaySummary
    Models/            KnowledgeItem, Project, TaskItem, Person, GraphEdge, Entity
    Services/          ContentPipeline, Search, Project, Task, Person services
  Ecosystem/
    Export/            MarkdownExporter, JSONExporter, ProjectExport, TaskReminders
    Import/            ImportRouter + 10 format importers
  LocalIntelligence/   EmbeddingService, SemanticSearchService
  Providers/           AIProvider protocol + OpenAI, Anthropic, Gemini, DeepSeek
  Security/            Biometric gate (Face ID)
  Storage/             FileArtifactStore, SecureKeyStore
  Transcription/       AppleSpeechTranscriptionEngine, RemoteTranscriptionEngine
  UI/
    Capture/           Scanner, Recording UI
    Chat/              ChatView, ChatViewModel, AgentStatusBar
    Inbox/             Universal search + triage
    Explore/           Project explorer
    Project/           Detail, Timeline, Graph, TaskBoard, Decisions, Entities
    Calendar/          MonthGrid, DayActivity, OnThisDay
    Knowledge/         Detail view, Connections
    Settings/          Provider picker, config, templates
    Components/        ContentView, CreationSheet, PermissionPrompt

Agentic chat

The chat system uses a Shell Virtual Filesystem approach. Instead of dozens of individual AI tools, the agent has a single run_command tool that executes Unix-like commands: ls, cd, cat, find, grep, mv, rm, head, wc, touch, echo, extract, history, js-eval, help. This replaced 47 individual tool files with a single ShellInterpreter.
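The single-tool idea can be illustrated with a minimal sketch. Everything here is hypothetical: `VirtualFS`, `MiniShell`, and the three commands it handles are illustrative stand-ins, not the actual ShellInterpreter, and the real command parsing is assumed to be richer (quoting, flags, cd state).

```swift
import Foundation

/// Hypothetical in-memory virtual filesystem: absolute paths map to text content.
struct VirtualFS {
    var files: [String: String]

    /// Lists the immediate children of a directory path such as "/projects".
    func list(_ directory: String) -> [String] {
        let prefix = directory.hasSuffix("/") ? directory : directory + "/"
        let children = files.keys
            .filter { $0.hasPrefix(prefix) }
            .compactMap { $0.dropFirst(prefix.count).split(separator: "/").first }
            .map { String($0) }
        return Array(Set(children)).sorted()
    }
}

/// Hypothetical single entry point, standing in for the run_command tool.
struct MiniShell {
    var fs: VirtualFS

    /// Parses a whitespace-separated command line and dispatches on the command name.
    func run(_ commandLine: String) -> String {
        let parts = commandLine.split(separator: " ").map { String($0) }
        guard let command = parts.first else { return "error: empty command" }
        let args = Array(parts.dropFirst())

        switch command {
        case "ls":
            return fs.list(args.first ?? "/").joined(separator: "\n")
        case "cat":
            guard let path = args.first, let body = fs.files[path] else {
                return "error: no such file"
            }
            return body
        case "grep":
            guard args.count == 2, let body = fs.files[args[1]] else {
                return "error: usage: grep <pattern> <file>"
            }
            return body.split(separator: "\n")
                .filter { $0.contains(args[0]) }
                .joined(separator: "\n")
        default:
            return "error: unknown command \(command)"
        }
    }
}
```

With this shape, adding a capability means adding a case to one dispatcher rather than registering another tool with the model, which is presumably the appeal of the approach.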

Key features of the agent system:

  • Context-aware conversations scoped to projects, items, or global
  • Auto/Deep/Fast modes controlling how much the agent iterates
  • Voice input via on-device speech recognition or Whisper API
  • Markdown rendering in messages via AttributedString
  • Choice prompts: numbered options become tappable buttons
  • Swipe actions on task/item cards for quick status changes
  • Suggestion bar on scroll-up with context-aware chips
  • AgentStatusBar: compact collapsible tool call display

Project graph

Every project has a typed graph connecting knowledge items, tasks, decisions, people, and entities. Graph edges carry provenance — each relationship is traceable to a transcript segment, note block, or external event.

Core models in the graph system:

  • KnowledgeItem — polymorphic model: meeting, note, journalEntry, webBookmark, image
  • Project — container with flexible frameworks (LLM-defined schemas)
  • TaskItem — status, priority, owner tracking
  • Person — people mentioned across meetings and projects
  • GraphEdge — typed relationship with source provenance
  • Entity — systems, organizations, tools extracted from content
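As a rough sketch of what "typed relationship with source provenance" could look like, the types below model an edge that always carries a pointer to its evidence. All names (`Provenance`, `NodeRef`, `Edge`, the edge kinds) are assumptions for illustration; the real GraphEdge model is a SwiftData type whose fields are not shown in this document.

```swift
import Foundation

/// Hypothetical provenance record: where a relationship was observed.
enum Provenance: Equatable {
    case transcriptSegment(itemID: UUID, startSeconds: Double, endSeconds: Double)
    case noteBlock(itemID: UUID, blockIndex: Int)
    case externalEvent(identifier: String)
}

/// Hypothetical node and edge kinds; the real type names may differ.
enum NodeKind: String { case knowledgeItem, task, decision, person, entity }
enum EdgeKind: String { case assignedTo, decidedIn, mentions, relatesTo }

struct NodeRef: Hashable {
    let kind: NodeKind
    let id: UUID
}

/// An edge cannot be constructed without provenance, so every relationship is traceable.
struct Edge {
    let from: NodeRef
    let to: NodeRef
    let kind: EdgeKind
    let provenance: Provenance
}

/// Returns the evidence behind every edge touching `node`.
func evidence(for node: NodeRef, in edges: [Edge]) -> [Provenance] {
    edges.filter { $0.from == node || $0.to == node }.map(\.provenance)
}
```

Making provenance a non-optional stored property is one way to enforce the rule at compile time instead of by convention.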

Content pipeline

Every item that enters wawa-note goes through a unified pipeline: Extract → Analyze → Detect signals → Ingest. The pipeline runs automatically for each item.

The pipeline handles:

  • Text extraction from audio (transcription), documents (OCR), and imported files
  • Analysis via configured AI provider — structured output with summaries, entities, and decisions
  • Signal detection — cross-referencing against existing project knowledge
  • Ingestion — persisting extracted entities, graph edges, and metadata into SwiftData
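The stages above can be sketched as a small synchronous pipeline. This is a simplified illustration under stated assumptions: extraction is taken as already done, the analyzer is a toy that treats capitalized words as entities instead of calling an AI provider, and `KnowledgeStore` is an in-memory stand-in for SwiftData. None of these names come from the actual ContentPipeline.

```swift
import Foundation

struct RawItem { let id: UUID; let text: String }          // text after extraction
struct Analysis { let summary: String; let entities: [String] }
struct Signal { let entity: String; let alreadyKnown: Bool }

/// Hypothetical analysis boundary; a real analyzer would call the configured provider.
protocol Analyzer { func analyze(_ text: String) throws -> Analysis }

/// Toy analyzer: treats capitalized words as entities.
struct CapitalizedWordAnalyzer: Analyzer {
    func analyze(_ text: String) throws -> Analysis {
        let words = text.split(whereSeparator: { !$0.isLetter }).map { String($0) }
        let entities = words.filter { $0.first?.isUppercase == true }
        return Analysis(summary: String(text.prefix(40)), entities: entities)
    }
}

/// In-memory stand-in for the persisted project knowledge.
final class KnowledgeStore {
    private(set) var knownEntities: Set<String> = []
    func ingest(_ signals: [Signal]) { knownEntities.formUnion(signals.map(\.entity)) }
}

struct Pipeline {
    let analyzer: any Analyzer
    let store: KnowledgeStore

    /// Analyze, detect signals against existing knowledge, then ingest.
    func process(_ item: RawItem) throws -> [Signal] {
        let analysis = try analyzer.analyze(item.text)
        let signals = analysis.entities.map {
            Signal(entity: $0, alreadyKnown: store.knownEntities.contains($0))
        }
        store.ingest(signals)
        return signals
    }
}
```

Signal detection runs before ingestion so that an entity is compared against what the store knew prior to this item, not against itself.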

AI providers

All AI providers implement a common protocol. Provider-specific JSON never leaks past the provider layer. Configuration is done through a JSON file (Providers/ai_config.json) that defines models, presets, context windows, and feature settings.

Provider                 | Status      | Notes
──────────────────────── | ─────────── | ─────
OpenAICompatibleProvider | Primary     | OpenAI, any /v1/chat/completions endpoint
AnthropicProvider        | Implemented | Claude models via Anthropic API
GeminiProvider           | Implemented | Google Gemini models
DeepSeek                 | Implemented | DeepSeek models
Local LLM                | Partial     | Model download infra ready, inference not wired yet

A single AIConfigService.shared.requestParams(for:model:) call handles model presets, reasoning model detection (temperature → nil for reasoning models), feature config ceilings, and context window limits. Services only own their system/user prompts — never hardcoded parameters.
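A minimal sketch of the kind of resolution such a call might perform is shown below. The types and the `featureCeiling`/`promptTokens` parameters are assumptions made for illustration; the actual `requestParams(for:model:)` signature takes different arguments, and the schema of ai_config.json is not shown in this document.

```swift
import Foundation

/// Hypothetical preset shape; not the actual ai_config.json schema.
struct ModelPreset {
    let isReasoningModel: Bool
    let defaultTemperature: Double
    let contextWindow: Int
}

struct RequestParams: Equatable {
    let temperature: Double?   // nil means "do not send a temperature"
    let maxTokens: Int
}

struct ParamResolver {
    let presets: [String: ModelPreset]

    /// `featureCeiling` caps output tokens per feature; the remaining context window caps them further.
    func requestParams(model: String, featureCeiling: Int, promptTokens: Int) -> RequestParams? {
        guard let preset = presets[model] else { return nil }
        let remaining = max(0, preset.contextWindow - promptTokens)
        return RequestParams(
            temperature: preset.isReasoningModel ? nil : preset.defaultTemperature,
            maxTokens: min(featureCeiling, remaining)
        )
    }
}
```

Centralizing this logic means a service that only supplies prompts cannot accidentally send a temperature to a reasoning model or request more tokens than the context window allows.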

Transcription

Two transcription engines are implemented behind a common protocol:

  • AppleSpeechTranscriptionEngine — on-device, works offline, no API cost
  • RemoteTranscriptionEngine — Whisper API via any OpenAI-compatible endpoint
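To illustrate how two engines can sit behind one protocol, here is a hedged sketch with stub engines. The `Transcriber` protocol, its `runsOnDevice` flag, and the selection policy in `pickTranscriber` are all assumptions; the document does not describe the real TranscriptionEngine requirements or how the app chooses between engines.

```swift
import Foundation

/// Hypothetical protocol; the real TranscriptionEngine requirements are not shown here.
protocol Transcriber {
    var runsOnDevice: Bool { get }
    func transcribe(audioAt url: URL) async throws -> String
}

/// Stub standing in for an on-device engine.
struct StubOnDeviceTranscriber: Transcriber {
    let runsOnDevice = true
    func transcribe(audioAt url: URL) async throws -> String { "on-device transcript" }
}

/// Stub standing in for a remote, Whisper-style engine.
struct StubRemoteTranscriber: Transcriber {
    let runsOnDevice = false
    func transcribe(audioAt url: URL) async throws -> String { "remote transcript" }
}

/// Assumed policy: use the remote engine only when the user opted in and the network is up.
func pickTranscriber(preferRemote: Bool, isOnline: Bool) -> any Transcriber {
    if preferRemote && isOnline {
        return StubRemoteTranscriber()
    }
    return StubOnDeviceTranscriber()
}
```

Callers depend only on the protocol, so a flag like `runsOnDevice` could also drive a UI indicator of where processing happens.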

iOS integrations

Integration             | Direction  | Status
─────────────────────── | ────────── | ───────────
Calendar read + sensor  | IN         | Implemented
Calendar create events  | OUT        | Implemented
Reminders export        | OUT        | Implemented
Share Extension         | IN         | Implemented
Format importers (10)   | IN         | Implemented
Export (MD, JSON, CSV)  | OUT        | Implemented
Context sensors (7)     | IN         | Implemented
Watch Connectivity      | BOTH       | Implemented
Apple Speech transcript | IN         | Implemented
Live Activities         | OUT        | Implemented
Vision OCR doc scanner  | IN         | Implemented
Core Spotlight index    | OUT        | Implemented
Contacts speaker match  | IN         | Implemented
Face ID biometric gate  | INTERNAL   | Implemented
App Intents / Siri      | —          | Not yet

Calendar integration includes MonthGrid, DayActivity, OnThisDay, and a unified view combining EKEvent items with knowledge items on the same timeline. Seven context sensors (including Calendar, AudioRoute, Location, Battery, MotionActivity, and FocusMode) feed the content pipeline.

Import & export

wawa-note is designed to be a place you can enter and leave freely. Everything is portable.

Import (10 formats)

Supported formats include JSON, Markdown, ICS (calendar), SRT (subtitles), PDF, HTML, RTF, Plain Text, and GitHub Issues. A Share Extension also accepts content from any iOS app.
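One way a router could dispatch to per-format importers is sketched below. The `Importer` protocol, `Router`, and the two sample importers are hypothetical; the real ImportRouter and FormatImporter APIs are not shown in this document, and routing by file extension is an assumption.

```swift
import Foundation

struct ImportedNote: Equatable { let title: String; let body: String }

/// Hypothetical protocol shape; the real FormatImporter requirements may differ.
protocol Importer {
    var fileExtensions: Set<String> { get }
    func importNote(named name: String, data: Data) throws -> ImportedNote
}

enum ImportError: Error { case unsupported(String), undecodable }

struct PlainTextImporter: Importer {
    let fileExtensions: Set<String> = ["txt"]
    func importNote(named name: String, data: Data) throws -> ImportedNote {
        guard let body = String(data: data, encoding: .utf8) else { throw ImportError.undecodable }
        return ImportedNote(title: name, body: body)
    }
}

struct MarkdownImporter: Importer {
    let fileExtensions: Set<String> = ["md", "markdown"]
    func importNote(named name: String, data: Data) throws -> ImportedNote {
        guard let text = String(data: data, encoding: .utf8) else { throw ImportError.undecodable }
        // Use the first "# " heading as the title when present, else the file name.
        let heading = text.split(separator: "\n").first { $0.hasPrefix("# ") }
        return ImportedNote(title: heading.map { String($0.dropFirst(2)) } ?? name, body: text)
    }
}

/// Picks the first importer that claims the file's extension (case-insensitive).
struct Router {
    let importers: [any Importer]

    func route(fileName: String, data: Data) throws -> ImportedNote {
        let ext = URL(fileURLWithPath: fileName).pathExtension.lowercased()
        guard let importer = importers.first(where: { $0.fileExtensions.contains(ext) }) else {
            throw ImportError.unsupported(ext)
        }
        return try importer.importNote(named: fileName, data: data)
    }
}
```

Supporting a new format in this design means adding one conforming type to the `importers` array; the router itself does not change.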

Export

Markdown, JSON, SRT, CSV, Graph JSON. Tasks can be exported to Apple Reminders. Calendar events can be created from meetings.

Use your data anywhere — ChatGPT, Claude, Notion, Obsidian. It's yours.

Security & privacy

There are no Wawa Note servers, so the project never receives your data. API keys stay on your device in the iOS Keychain. All permissions are optional; the app works with reduced functionality when permissions are denied.

Key rules

  • API keys in Keychain only. Never in SwiftData, UserDefaults, JSON, or logs.
  • Face ID biometric gate available for accessing sensitive content.
  • Raw audio deletable while keeping transcript and analysis.
  • Original transcript always preserved; edits saved separately.
  • Provider config stores only a Keychain identifier, never the key value.
  • Logs never include API keys, authorization headers, or full transcripts.
  • All processing indicators show whether work is on-device or remote.
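Two of these rules (config holds only a Keychain identifier; logs never include authorization headers) can be sketched as follows. The `ProviderConfig` fields, the `SecretStore` protocol, and the helper functions are hypothetical, and the in-memory store is a stand-in so the sketch runs anywhere; a real implementation would read from the iOS Keychain, and the actual SecureKeyStore API is not shown here.

```swift
import Foundation

/// Hypothetical persisted provider settings: a Keychain lookup identifier, never the key.
struct ProviderConfig: Codable, Equatable {
    let providerName: String
    let baseURL: URL
    let keychainIdentifier: String
}

/// Stand-in for a Keychain-backed store.
protocol SecretStore {
    func secret(for identifier: String) -> String?
}

struct InMemorySecretStore: SecretStore {
    let secrets: [String: String]
    func secret(for identifier: String) -> String? { secrets[identifier] }
}

/// Builds the Authorization header at request time; the key is never written to config.
func authorizationHeader(for config: ProviderConfig, using store: any SecretStore) -> String? {
    store.secret(for: config.keychainIdentifier).map { "Bearer \($0)" }
}

/// Redacts Authorization values before a header dictionary is logged.
func redactedForLogging(_ headers: [String: String]) -> [String: String] {
    headers.reduce(into: [String: String]()) { result, pair in
        let isAuth = pair.key.lowercased() == "authorization"
        result[pair.key] = isAuth ? "<redacted>" : pair.value
    }
}
```

Because `ProviderConfig` has no field for the key itself, serializing it to JSON or SwiftData cannot leak the secret.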

On-device AI

The infrastructure for downloading and managing local AI models is in place (ModelDownloadService and ModelRegistry exist). The planned setup:

Model                        | Size    | Purpose
──────────────────────────── | ─────── | ───────
Llama 3.2 1B (Q4_K_M)        | ~500 MB | Summarization, task extraction, structured JSON output
EmbeddingGemma 300M (Q4_K_M) | ~200 MB | Semantic search embeddings

Models download from Hugging Face on first use, are verified with SHA256, and are stored in Application Support. The existing AIProvider protocol means switching between cloud and local inference requires no changes to the rest of the app. Inference via swift-llama (llama.cpp + Metal GPU) is not yet wired.
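A checksum check of this kind might look like the sketch below, which uses Apple's CryptoKit and hashes the file in chunks so a model of several hundred megabytes never has to sit in memory at once. The function names and the chunk size are assumptions; this is not the actual ModelDownloadService code.

```swift
import Foundation
import CryptoKit

/// Streams the file through SHA-256 and returns the lowercase hex digest.
func sha256Hex(ofFileAt url: URL, chunkSize: Int = 1 << 20) throws -> String {
    let handle = try FileHandle(forReadingFrom: url)
    defer { try? handle.close() }

    var hasher = SHA256()
    while let chunk = try handle.read(upToCount: chunkSize), !chunk.isEmpty {
        hasher.update(data: chunk)
    }
    return hasher.finalize().map { String(format: "%02x", $0) }.joined()
}

/// Accepts a download only if its digest matches the expected checksum.
func isVerifiedDownload(at url: URL, expectedSHA256: String) throws -> Bool {
    try sha256Hex(ofFileAt: url) == expectedSHA256.lowercased()
}
```

A caller would typically delete the file and retry the download when verification fails, rather than loading an unverified model.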

Coding standards

  • Protocol-first boundaries. Every integration behind a protocol: AIProvider, TranscriptionEngine, FormatImporter, ContextSensor.
  • Swift 6.0 concurrency: async/await, @MainActor for UI state, @preconcurrency import for unaudited Apple frameworks.
  • All AI requests must use AIConfigService.shared.requestParams(for:model:); never hardcode temperature or maxTokens.
  • Provider-specific JSON stays inside the provider. App code uses internal models only.
  • Provenance on every graph edge: relationships traceable to source evidence.
  • SwiftData: no @Relationship cascade deletes. Manual recursive delete in service layer.
  • Keep SwiftUI views thin. Services testable without UI. Dependency injection through initializers.
  • 27 unit tests in the test target. Do not claim something works on device until tested there.

What's next

v1.0.0 is stable and available now. Areas still in progress:

  • On-device LLM inference via swift-llama (download infra ready, inference not wired)
  • Semantic search UI (EmbeddingService exists, not surfaced yet)
  • Cross-reference results persisted as GraphEdges (currently ephemeral DTOs)
  • Device validation on iPhone 14 Plus hardware
  • App Intents / Siri integration (needs extension target)

Track progress and contribute on GitHub.