In MN 4.3, MarginNote rebuilt its data storage from the ground up as a content-addressed object pool plus Manifest. It applies git's "commit by version, address by hash" approach to your study data. Local integrity, backup, and cross-device sync are all applications of that foundation. The "mind map corruption" and "lost handwriting" issues users reported for years are fixed structurally, because the atomic unit moves from record to manifest.
MN 3.x → 4.2: CloudKit Sync (record-level) + Full Export → Since MN 4.3: New data architecture · Auto Backup · Local integrity → Coming Soon: Cloud Drive Sync
Architecture
At the center is the data architecture itself (introduced in MN 4.3), reused in three places: local integrity safety net / Auto Backup / Cloud Drive Sync (Coming Soon). Below, Full Export and CloudKit Sync are legacy mechanisms (MN 4.2 and earlier), preserved alongside the new architecture.
MN 4.3+ · Foundation
MN 4.3 introduced a versioned data foundation: not just "backup" or "sync", but a foundation reused in three places. This is the real architectural upgrade in this generation.
Content-addressed SHA-256
Every piece of data is addressed by SHA-256 hash — identical content stored once. Files use writeToFile:atomically:YES; on index-insert failure, the just-written file is rolled back.
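The write path described above can be sketched as follows. This is a minimal illustration, not MarginNote's implementation: the `ObjectPool` class, its in-memory `index`, and the `_index_insert` hook are hypothetical names, and a temp-file-plus-rename stands in for `writeToFile:atomically:YES`.

```python
import hashlib
import os
import tempfile
from pathlib import Path


class ObjectPool:
    """Toy content-addressed store: the file name is the SHA-256 of the bytes."""

    def __init__(self, root: Path):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.index = {}  # hash -> size in bytes (stand-in for a real index DB)

    def _index_insert(self, digest: str, size: int) -> None:
        # Hypothetical hook: a real index insert could fail (disk full, DB locked).
        self.index[digest] = size

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.index:
            return digest  # identical content is stored once
        target = self.root / digest
        # Write to a temp file, then rename, so readers never see a partial file.
        fd, tmp = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, target)
        try:
            self._index_insert(digest, len(data))
        except Exception:
            target.unlink(missing_ok=True)  # roll back the just-written file
            raise
        return digest
```

Calling `put` twice with the same bytes returns the same hash and leaves a single file on disk, which is the dedup property the paragraph describes.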
Snapshot-level atomic unit
Each Study Set snapshot is a Manifest: versionNumber + deviceId + content hash + dependency graph. Sync and recovery operate on manifests, not fields or records.
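As a rough picture of what such a manifest might hold, here is a sketch. The field layout, the canonical-JSON hashing, and the shape of `dependencies` are assumptions made for illustration; only the four ingredients (versionNumber, deviceId, content hash, dependency graph) come from the description above.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Manifest:
    version_number: int
    device_id: str
    # Hypothetical layout: logical name -> list of object hashes it depends on.
    dependencies: dict = field(default_factory=dict)

    def content_hash(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so equal manifests hash equally.
        payload = json.dumps(
            {
                "versionNumber": self.version_number,
                "deviceId": self.device_id,
                "dependencies": self.dependencies,
            },
            sort_keys=True,
            separators=(",", ":"),
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def required_objects(self) -> set:
        # Every object hash the snapshot needs in order to be complete.
        return {h for hashes in self.dependencies.values() for h in hashes}
```

Because sync and recovery work on the manifest, a receiver can check `required_objects()` against what it holds before treating the snapshot as usable.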
Metadata atomic commit
Snapshot metadata is committed inside a SQLite transaction. Either the snapshot lands fully, or it never appeared — no half-committed manifests.
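The all-or-nothing behaviour can be illustrated with Python's built-in sqlite3 module. The table and column names below are invented for the example and are not MarginNote's schema.

```python
import sqlite3


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE snapshots(hash TEXT PRIMARY KEY, version INTEGER, device_id TEXT);
        CREATE TABLE snapshot_objects(
            snapshot_hash TEXT NOT NULL,
            object_hash TEXT NOT NULL,
            UNIQUE(snapshot_hash, object_hash));
    """)
    return conn


def commit_snapshot(conn, manifest_hash, version, device_id, object_hashes):
    """Insert a snapshot row plus its object references, all or nothing."""
    # Used as a context manager, the connection commits on success
    # and rolls back if any statement raises.
    with conn:
        conn.execute(
            "INSERT INTO snapshots(hash, version, device_id) VALUES (?, ?, ?)",
            (manifest_hash, version, device_id),
        )
        conn.executemany(
            "INSERT INTO snapshot_objects(snapshot_hash, object_hash) VALUES (?, ?)",
            [(manifest_hash, h) for h in object_hashes],
        )
```

If the second statement fails (here, on a duplicate reference), the snapshot row from the first statement is rolled back too, so no half-committed manifest is ever visible.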
⚠ This isn't a single global ACID transaction across CoreData + object pool + snapshot DB. It's "object pool atomic + manifest atomic" as two separate atomicities. The user-facing experience is a version-control safety net, not a database-level ACID journal.
MN 4.3+ Application ① · Local integrity
"Lost handwriting", "mind map corruption", "notes vanished after crash" are MarginNote users' long-standing pain points. The new architecture catches edits as snapshots locally — no remote destination required.
Edit a Study Set → marked dirty → if editing pauses for 5 minutes → an immutable snapshot is committed locally.
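The dirty-then-idle trigger is essentially a debounce. A minimal sketch, assuming a hypothetical `SnapshotScheduler` that is told the current time explicitly (a real app would use a timer):

```python
IDLE_SECONDS = 5 * 60  # the 5-minute pause described above


class SnapshotScheduler:
    def __init__(self, commit, idle_seconds=IDLE_SECONDS):
        self.commit = commit              # callback taking a Study Set id
        self.idle_seconds = idle_seconds
        self.last_edit = {}               # dirty Study Set id -> time of latest edit

    def on_edit(self, study_set_id, now):
        # Mark dirty and restart the idle window for this Study Set.
        self.last_edit[study_set_id] = now

    def tick(self, now):
        """Commit every dirty Study Set whose last edit is at least idle_seconds old."""
        due = [s for s, t in self.last_edit.items() if now - t >= self.idle_seconds]
        for s in due:
            self.commit(s)
            del self.last_edit[s]         # clean again until the next edit
        return due
```

Each new edit pushes the snapshot further out, so a burst of editing produces one snapshot after the burst ends rather than one per change.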
If object content fails hash verification — bad sectors, corrupted file — verifyIntegrity + rebuildIndex can rebuild the object index.
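What hash verification and an index rebuild could look like for a content-addressed pool is sketched below. The function names mirror verifyIntegrity and rebuildIndex, but the bodies are assumptions for illustration: a flat directory of files named by their SHA-256, and a plain dict as the index.

```python
import hashlib
from pathlib import Path


def verify_integrity(root: Path, index: dict) -> list:
    """Return hashes whose file is missing or whose bytes no longer match."""
    bad = []
    for digest in index:
        path = root / digest
        if not path.is_file() or hashlib.sha256(path.read_bytes()).hexdigest() != digest:
            bad.append(digest)
    return bad


def rebuild_index(root: Path) -> dict:
    """Rescan the pool and keep only files whose name matches their content hash."""
    index = {}
    for path in root.iterdir():
        if path.is_file():
            data = path.read_bytes()
            if hashlib.sha256(data).hexdigest() == path.name:
                index[path.name] = len(data)
    return index
```

Because the name is the hash, corruption is detectable without any extra metadata: a file either hashes to its own name or it does not.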
Repair Missing Baseline Versions (Recommended): only shown when uncovered Study Sets are detected.
⚠ Honest boundary: the protection above relies on auto-snapshots being on (the default). If you manually disable auto-snapshot AND have never run a manual / baseline snapshot, "the state from one second before the crash" is not guaranteed by the code to be recoverable. What the architecture gives you is recovery to the most recent complete snapshot, not lossless per-keystroke versioning.
MN 3.x → 4.2 · Legacy preserved
The legacy sync system, in use since 2015. Built on the CloudKit private database, it keeps devices signed in to the same Apple ID in sync. It has been MN's main sync for years, but it has a structural flaw, explained below.
Study Sets (Topic) / Excerpts & Notes (BookNote) / Media attachments / Document config / FSRS review progress / Tags / Highlight styles
Single PDFs over 256 MB are not pushed to iCloud Documents, so that sync does not get stuck on them.
Change tracking: edit Study Set → mark dirty → background incremental sync
Real-time pull: CloudKit zone subscription listens for remote changes
CloudKit optimistic locking (CKRecordSaveIfServerRecordUnchanged) + field-level merge by hash / timestamp / dirty state. No StudySet-level integrity guarantee — that's the root of the "corruption" issue below.
Token expired → auto-refresh; CloudKit partial failure → per-record handling; stuck? Reset iCloud Sync Buffer for full re-sync.
Why it can "corrupt"
CloudKit Sync's smallest unit is CKRecord. Each Topic (Study Set) and BookNote is a separate record. The technical root of historical "mind map corruption" reports:
Mind map layout (Topic.optionsJson) and node positions (BookNote.mindpos) live in different records.
CKModifyRecordsOperation.atomic = YES only covers the current batch. A Study Set is split into multiple batches by CloudKit due to resultsLimit / media size.
Batch A succeeds (with Topic.optionsJson), Batch B fails (with some BookNote.mindpos) → remote layout points to non-existent positions → user perceives "mind map corrupted".
This isn't a one-off bug; it's the structural consequence of record-level sync. The new architecture (Cloud Drive Sync) fixes this by lifting the atomic unit to manifest, as described below.
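The failure mode is easy to reproduce in a toy model. In the sketch below, the remote store is a plain dict and the record keys (`"Topic.optionsJson"`, `"BookNote.mindpos:<id>"`) are a made-up encoding for illustration, not CloudKit's actual record layout.

```python
def apply_batches(remote, batches, fail_at=None):
    """Apply batches in order; each batch is all-or-nothing, the sequence is not."""
    for i, batch in enumerate(batches):
        if i == fail_at:
            return False  # this batch and all later ones never land
        remote.update(batch)
    return True


def dangling_nodes(remote):
    """Node ids the layout references that have no position record."""
    layout = remote.get("Topic.optionsJson", {}).get("nodes", [])
    return [n for n in layout if f"BookNote.mindpos:{n}" not in remote]
```

With a layout update in the first batch and the matching position record in the second, failing the second batch leaves a layout that references a node with no position. Each batch was atomic, yet the Study Set as a whole is inconsistent.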
📌 Settings path: Settings → Cloud Sync → Note Database Sync Mode → CloudKit Sync. Requires iCloud sign-in.
Available today · Backup
The two paths are independent and can both be on — recommended. Full Export is the long-standing legacy mechanism (manual one-shot full archive); Auto Backup is MN 4.3's new architecture applied in the backup scenario (background incremental versioned).
Manual · One-shot · Long-standing
Settings → Backup & Restore → Full Export, three granularities:
Output: MarginNoteBackup(yyyy-MM-dd-HH-mm-ss).marginbackupall
You manage the package — store it wherever. The system never auto-uploads.
Background · New architecture application · Since MN 4.3
Applies the new data architecture (object pool + Manifest) in the backup scenario — runs at the frequency you set, saves only what changed:
Because it's an application of the new architecture, you get object dedup + content addressing for free. Identical content stored once; partial failure shows as missing objects (repairable), not as field corruption.
Defaults: 5-min idle no-edit triggers a snapshot; pre-sync auto-snapshot; pre-conflict auto-snapshot.
📌 Settings path: Settings → Backup & Restore. The two paths are independent — recommended to enable both.
Next-gen · Sync
The next-generation sync system — mutually exclusive with CloudKit Sync (pick one in Settings). It's not "another CloudKit"; it's sync redone using MN 4.3's new data architecture. The atomic unit moves from record to manifest — and the structural class of issues like "mind map corruption" doesn't occur in the new architecture.
iCloud Drive · Baidu NetDisk · Google Drive
OneDrive / Dropbox interfaces are reserved in code (currently hidden); WebDAV is backup-only, not in the Cloud Drive Sync options.
Every change creates an immutable snapshot: versionNumber + deviceId + content hash all live in the manifest.
Key difference: remote transfer can still half-fail, but it shows as missing objects or corrupted index (repairable) — never as "layout pointing to non-existent nodes".
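Why a half-failed transfer stays repairable can be shown with a small sketch. It assumes the local pool is a dict of hash to bytes and that `fetch` is some hypothetical callable returning the bytes for a hash (or None); neither is MarginNote's actual API.

```python
import hashlib


def missing_objects(manifest_hashes, local_pool):
    """Hashes the manifest requires that the local pool does not hold."""
    return sorted(h for h in set(manifest_hashes) if h not in local_pool)


def repair(manifest_hashes, local_pool, fetch):
    """Re-fetch missing objects; accept bytes only if they hash to the expected name."""
    still_missing = []
    for h in missing_objects(manifest_hashes, local_pool):
        data = fetch(h)
        if data is not None and hashlib.sha256(data).hexdigest() == h:
            local_pool[h] = data
        else:
            still_missing.append(h)  # visible and retryable, never silently wrong
    return still_missing
```

The manifest names exactly which hashes it needs, so an incomplete download shows up as a concrete list of missing hashes rather than as data that looks valid but is not.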
On a version conflict the UI offers three choices: Use Remote / Merge (default) / Keep Local.
Before merge, an automatic Pre-merge Backup is created — even if you pick wrong, the original is intact.
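The three choices and the pre-merge backup can be sketched like this. The data model is an assumption for illustration: each side is a dict of object id to `(hash, timestamp)`, and the merge keeps the newer entry when hashes differ. Whether the other two choices also create a backup is not stated above, so this sketch backs up only before a merge.

```python
import copy
from enum import Enum


class Choice(Enum):
    USE_REMOTE = "use_remote"
    MERGE = "merge"
    KEEP_LOCAL = "keep_local"


def merge_objects(local, remote):
    """Object-level merge: per id, keep the entry with the newer timestamp."""
    merged = dict(local)
    for obj_id, (r_hash, r_time) in remote.items():
        if obj_id not in merged:
            merged[obj_id] = (r_hash, r_time)
            continue
        l_hash, l_time = merged[obj_id]
        if l_hash != r_hash and r_time > l_time:
            merged[obj_id] = (r_hash, r_time)
    return merged


def resolve(local, remote, backups, choice=Choice.MERGE):
    if choice is Choice.KEEP_LOCAL:
        return dict(local)
    if choice is Choice.USE_REMOTE:
        return dict(remote)
    backups.append(copy.deepcopy(local))  # pre-merge backup, taken before any change
    return merge_objects(local, remote)
```

Since the backup is a copy taken before the merge runs, an unwanted merge result can be discarded and the pre-merge state restored from `backups`.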
First-launch on a new device pulls remote snapshots and reconstructs local data per manifest; optionally pulls original documents (PDF/Video) too.
Object index can be rebuilt; missing remote objects can be repaired via baseline-repair.
📌 Current state: code complete, UI gated by internal flag mn5SyncSupported; rollout schedule TBA.
MN 4.2 and earlier vs MN 4.3+ · Data atomicity
These two failure modes have completely different recovery costs. "Corruption" requires the user to manually identify which data is wrong, possibly to roll back or hand-fix. "Missing objects" is visible, locatable, automatically repairable — it tells you which hash didn't download fully and re-fetches. That's the difference MN 4.3's new architecture brings.
Real value when things go wrong
A sync & backup system's real value isn't when it works — it's when it breaks. This section maps recovery paths for every failure mode, so you know which button to press when things go sideways.
Checking for updates / Preparing download / Downloading (n/m) / Preparing upload / Uploading (n/m) / Sync completed / Sync failed
Each state corresponds to a real sync stage — not vibes, actual progress.
On detected version change: three choices — Use Remote / Merge / Keep Local. Default is "Merge", and a pre-merge backup is auto-created.
If CloudKit Sync gets stuck or behaves oddly, you can reset the sync buffer in Settings → Cloud Sync, after which all data is re-compared. This is the nuclear option, and it asks for a second confirmation.
Every manual sync auto-backs-up the local DB to Documents/ManualSyncBackups, keeping the latest 5. Even if a conflict gets ugly, the last 5 manual-sync states are recoverable.
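A keep-the-latest-5 rotation can be as small as the sketch below. It assumes backup file names embed a sortable timestamp so that name order equals age order; the actual naming inside ManualSyncBackups is not specified here.

```python
from pathlib import Path


def prune_backups(folder: Path, keep: int = 5) -> list:
    """Delete all but the `keep` newest backups; return the removed file names."""
    # Assumption: names sort chronologically (e.g. they contain yyyy-MM-dd-HH-mm-ss).
    backups = sorted(
        (p for p in folder.iterdir() if p.is_file()),
        key=lambda p: p.name,
        reverse=True,
    )
    removed = backups[keep:]
    for p in removed:
        p.unlink()
    return [p.name for p in removed]
```

Running this after each manual sync backup keeps storage bounded while preserving the five most recent restore points.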
If you stop using CloudKit Sync and want to clear cloud data, the settings let you manually delete CloudKit zones — frees iCloud quota, local data unaffected.
The Cloud Drive Sync object pool exposes manual GC + baseline-version repair — fixes orphan objects and missing baselines that long-term usage can produce.
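Orphan collection in a content-addressed pool is a mark-and-sweep over the manifests. A minimal sketch, assuming the pool is a dict of hash to bytes and `manifests` maps a snapshot name to the hashes it references (both hypothetical shapes):

```python
def collect_garbage(pool, manifests):
    """Remove objects that no manifest references; return the removed hashes."""
    # Mark: every hash reachable from any manifest is live.
    live = set().union(*manifests.values())
    # Sweep: anything in the pool that is not live is an orphan.
    orphans = sorted(h for h in pool if h not in live)
    for h in orphans:
        del pool[h]
    return orphans
```

Objects shared by several snapshots stay in the pool as long as at least one manifest still lists them, which is why GC is safe to run alongside dedup.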
Honest boundaries
"Never lose data" sounds great — but no engineering system can actually promise that. MN chooses not to overclaim, and instead tells you exactly what each layer of protection does.
Sync is eventually consistent and based on change tracking: devices converge within seconds to minutes. It's not Google Docs character-level co-editing. MN doesn't do realtime collab.
Cloud Drive Sync's merge is object-level by manifest / hash / timestamp — better than last-write-wins, but no full common-ancestor lookup yet. So "git for study notes" is a design analogy, not literal git equivalence.
"The architecture itself protects data" relies on auto-snapshots being on (the default). If you manually disable auto-snapshot AND have never run a manual / baseline snapshot, the state from one second before a crash is not guaranteed by the code to be recoverable. What's recoverable is "the most recent complete snapshot".
Cloud outages, full device failure, Apple ID issues, accidental deletes: all of these can happen. The new architecture + Auto Backup + Full Export reduce risk through layered protection, but they make no claim of "absolute safety". Recommended: periodic Full Export + one copy somewhere you control (external SSD / your own NAS / a cloud account different from your daily one).
About versions: the "new data architecture" described on this page was introduced in MN 4.3. "Cloud Drive Sync" is the next step built on that architecture, currently Coming Soon — the rollout schedule follows the official release notes.
Download MarginNote 4 and put your notes into a tool that treats data safety as an explicit engineering job.