I Built a Second Brain. It Already Matched Google's Standard (OKF)

A while ago I wrote about how Karpathy's LLM Wiki helped me organize my knowledge base - over 300 notes managed by an agent, three navigation indexes, workflows for ingest and compilation. The system works. Every day. It's my second brain.

But recently I caught myself asking an uncomfortable question. All that knowledge lives in my tools - in Obsidian, in Quartz, in my CLAUDE.md. What if I want to hand it to someone else? Switch tools? Or treat it as a company asset meant to outlive any program and any model? Notes locked in a private format are vendor lock-in too, just self-inflicted.

The conclusion I reached is simple: what makes knowledge durable isn't the tool - it's the format. That's when I came across OKF (Open Knowledge Format) by Google. I audited my brain against this standard, expecting a sizable gap. The result surprised me: 100% on the hard conformance conditions and roughly 85-90% on the recommended layer - even though I never designed the brain for OKF. In this article I'll show why that happened, what exactly I checked, and what happens when plain markdown isn't enough. No theory - a concrete audit and concrete numbers.

What OKF (Open Knowledge Format) is

OKF is an open format for recording knowledge, designed for the agent era. Technically there's no magic in it: it's markdown + YAML frontmatter. The whole philosophy fits in the pattern's one-line tagline: "authored by people, generated by agents". Exactly the split I'd arrived at organically.

The base unit is a Knowledge Bundle - a directory of .md files. It can be a git repository, a tarball, or a subdirectory. And that's the heart of portability: git clone and you have the whole thing. Inside, a Concept is a single thought - one .md file = one concept (frontmatter plus markdown content). The format reserves two filenames: index.md (a directory listing for progressive disclosure) and log.md (a chronology of changes, newest on top).

The most interesting part is the minimal barrier to entry. The only hard-required frontmatter field is type. Everything else - title, description, resource, tags, timestamp - is recommended but optional. A producer may add its own keys, and a consumer must tolerate unknown fields and broken links. The simplest compliant concept looks like this:

---
type: note
---
Note content in plain markdown.

It's worth understanding the difference between OKF and LLM Wiki. OKF is an interoperability specification - it says "how to record knowledge so it's exchangeable". LLM Wiki is a working methodology - it says "how an agent should build and maintain that knowledge". Two complementary layers, not competitors.

One honest note. The OKF repository carries an explicit disclaimer: "This repository and its contents are not an official Google product". OKF is an open interoperability spec, not a product commitment from Google. I'm writing this deliberately - credibility comes from precision, not from overstating.

My brain matched the standard - even though I never designed it for it

This is the heart of the story. I ran a real compliance audit of the brain against OKF v0.1. The verdict in one sentence: the brain passes 100% of the hard conformance conditions of OKF v0.1; the discrepancies are purely in the recommended layer - cosmetics.

The spec defines three hard conformance conditions. The brain meets all three:

| # | OKF requirement | Status | Evidence in brain |
| --- | --- | --- | --- |
| 1 | Every non-reserved .md has parsable YAML frontmatter | ✅ | Every note opens with --- ... --- |
| 2 | Every frontmatter has a non-empty type | ✅ | type: basic-note \| book-note \| knowledge-note \| tool \| compiled-note \| answer-note |
| 3 | Reserved files have the proper structure when they exist | ✅ | Brain doesn't use literal index.md/log.md → that role is filled by _indexes/*; the "when they exist" condition is met |
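The first two conditions are mechanical enough to script. What follows is a minimal sketch, not the audit tooling behind the numbers in this article: the function names are hypothetical, the frontmatter reader is a naive `key: value` scanner rather than a real YAML parser, and the third condition (structure of index.md and log.md) is deliberately left out.

```python
from pathlib import Path

RESERVED = {"index.md", "log.md"}  # filenames the article says OKF reserves


def split_frontmatter(text):
    """Return (frontmatter_lines, body), or None when no ---/--- block opens the file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return None  # opening delimiter without a closing one


def read_keys(frontmatter_lines):
    """Naive reader for flat 'key: value' lines; an assumption, NOT a YAML parser."""
    keys = {}
    for line in frontmatter_lines:
        # Skip nested/indented lines and comments; unknown keys are simply kept.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            keys[key.strip()] = value.strip().strip("'\"")
    return keys


def audit_bundle(root):
    """Return {relative_path: problem} for concept files failing condition 1 or 2."""
    problems = {}
    for path in sorted(Path(root).rglob("*.md")):
        if path.name in RESERVED:
            continue  # condition 3 (reserved-file structure) is out of scope here
        rel = str(path.relative_to(root))
        parts = split_frontmatter(path.read_text(encoding="utf-8"))
        if parts is None:
            problems[rel] = "no frontmatter"
        elif not read_keys(parts[0]).get("type"):
            problems[rel] = "missing or empty type"
    return problems
```

An empty result from `audit_bundle("path/to/vault")` would mean every concept file passes the first two conditions. Custom keys never count as a problem here, in line with the rule that a consumer must tolerate unknown fields.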

In numbers: hard compliance is 100% (3/3). The recommended layer (the optional fields and conventions) is ~85-90%. And here it gets interesting, because all four discrepancies are naming-and-syntax issues - they're about what I named something, not how I model knowledge:

| Area | OKF | brain | Nature |
| --- | --- | --- | --- |
| Links | [label](/path) | [[wikilinks]] (Obsidian) | syntax; Quartz compiles to [label](path) at build |
| Index | literal index.md | _indexes/vault-map.md + catalog.md + graph.md | name, not function |
| Log | log.md | a ## Recent Changes section in _indexes/vault-map.md + git history | role in a section, not a separate file |
| Frontmatter keys | description / resource / timestamp | summary / source / date | 1:1 mappable on export |

Look at the "Nature" column. Nowhere does it say "I'm missing this concept" - everywhere it's "I named it differently" or "the tool compiles this for me". Obsidian wikilinks are full-fledged OKF links after the Quartz build. _indexes/ plays exactly the role of index.md. The fields map one to one.
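To make "mappable on export" concrete, here is a rough sketch of the two mechanical conversions: renaming frontmatter keys and rewriting wikilinks. The key mapping comes from the table; everything else is an assumption for illustration. In particular, `slug` is a hypothetical path scheme (it is not how Quartz builds URLs), and heading anchors inside wikilinks are simply dropped.

```python
import re

# Vault key -> OKF recommended key, as listed in the table above.
KEY_MAP = {"summary": "description", "source": "resource", "date": "timestamp"}

# Matches [[Target]], [[Target|Label]] and [[Target#Heading|Label]].
# Embeds such as ![[image.png]] are skipped by the negative lookbehind.
WIKILINK = re.compile(r"(?<!!)\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]+))?\]\]")


def to_okf_keys(frontmatter):
    """Rename mapped keys; leave type and any custom keys untouched."""
    exported = {}
    for key, value in frontmatter.items():
        target = KEY_MAP.get(key, key)
        # If the note already carries the OKF name, that value wins.
        if target != key and target in frontmatter:
            continue
        exported[target] = value
    return exported


def slug(target):
    """Hypothetical path scheme: trim, spaces to hyphens, root-relative, add .md."""
    return "/" + target.strip().replace(" ", "-") + ".md"


def wikilinks_to_markdown(text):
    """Rewrite Obsidian-style wikilinks as [label](/path) links."""
    def repl(match):
        target, label = match.group(1), match.group(2)
        return f"[{(label or target).strip()}]({slug(target)})"
    return WIKILINK.sub(repl, text)
```

For example, `wikilinks_to_markdown("See [[Vault Map]]")` yields `See [Vault Map](/Vault-Map.md)` under this path scheme. A real export would need to match whatever paths the bundle actually uses.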

"Compliance isn't a coincidence - the brain and OKF grow from the same LLM Wiki concept by Karpathy. OKF is an interoperability spec, the brain is a working implementation. The differences are tool choices, not differences in the knowledge model."

This is convergent evolution. Two systems built independently - a spec inside Google and my Obsidian vault - converged on the same shape because they grew from the same idea. And that's precisely why the standard is credible: it wasn't invented at a desk, it was written down from where practice was already heading.

Why the format matters

You might ask: if my system works, why do I need a standard at all? The answer is five concrete things the format gives you - and that no single tool can:

Portability. "If you can git clone it, you can ship it." A bundle is a repository - you copy the whole thing and it works for anyone, no install, no config, no dependence on my setup.
Interoperability. A knowledge bundle can be exchanged regardless of tool. Export from my brain, import into someone else's. This is a real future product direction - an extract of a knowledge base as a ready OKF bundle.
Durability and future-proofing. Tools die, the format stays. Markdown plus frontmatter will outlive Obsidian, outlive Quartz, outlive a specific LLM. In five years you'll still open these files.
Dual readability. The same file is read by an agent and a human. Without any tool you just open it in an editor - it's not a base in a closed binary format.
An asset, not a silo. Knowledge recorded in a standard becomes a resource you can audit, hand off, and value. And that's the bridge to the next question - what happens when this asset grows to company scale.

When plain markdown isn't enough - Google Knowledge Catalog

Here I switch to a cautious tone. I have not tested this part - I'm signaling a direction, not issuing a recommendation.

Karpathy himself sketches the scale where this approach works: "moderate scale (~100 sources, ~hundreds of pages)" - before embedding-based search becomes necessary. My brain is on the order of 300 notes today, still in that range: plain markdown with indexes handles it without embeddings and without RAG infrastructure. In the previous article I showed this mechanism in numbers: progressive disclosure gives about 30x less context per query than dumping the whole vault. This isn't a hard limit - more an order of magnitude up to which the index-first approach stays comfortable. For a personal second brain it's plenty. But above that - millions of documents, structured and unstructured data at once, many agents working in parallel - it's a different league.
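The index-first idea can be illustrated with a toy sketch. It is not the agent workflow from the previous article and it does not reproduce the 30x figure: the data shapes, function names, and the word-overlap matching are all assumptions chosen to keep the example short. The point is only the order of operations: read the small index first, then load the few notes it points to.

```python
def select_from_index(index, query):
    """Return paths whose index summary shares at least one word with the query."""
    terms = set(query.lower().split())
    return [e["path"] for e in index if terms & set(e["summary"].lower().split())]


def context_for(query, index, notes):
    """Build the context: the whole index, plus only the notes the index selects."""
    index_text = "\n".join(f'{e["path"]}: {e["summary"]}' for e in index)
    picked = select_from_index(index, query)
    return index_text + "\n\n" + "\n\n".join(notes[p] for p in picked)


# Toy vault (hypothetical content): three notes and a one-line summary for each.
notes = {
    "okf.md": "OKF " * 200,
    "quartz.md": "Quartz " * 200,
    "git.md": "git " * 200,
}
index = [
    {"path": "okf.md", "summary": "open knowledge format spec"},
    {"path": "quartz.md", "summary": "static site build"},
    {"path": "git.md", "summary": "version history"},
]

context = context_for("knowledge format", index, notes)
whole_vault = sum(len(body) for body in notes.values())
print(len(context), "characters loaded instead of", whole_vault)
```

With a real vault the saving depends on how many notes a query touches and how large the index is, so the ratio here says nothing about the figure quoted above.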

This is where Google Knowledge Catalog comes in (a managed service; per the product page, "formerly Dataplex"). Google describes it as a "universal context engine for your enterprise", with "always-on context and governance for your agents". Instead of your notes, it catalogs the whole data estate: it automatically harvests metadata from BigQuery, AlloyDB, Spanner, or Looker, and turns unstructured sources - PDFs, contracts, wikis - into a "structured knowledge graph" that an agent can query. The same repository (GoogleCloudPlatform/knowledge-catalog) that contains the okf/ directory also holds samples/ and toolbox/.

There's continuity here, not a leap. The same idea - knowledge standardized, semantic, legible to agents - just at enterprise level. OKF and LLM Wiki are personal scale; Knowledge Catalog is company scale. One chain, two ends.

Where exactly is the boundary? It's not "local vs cloud" - my brain runs in the cloud too. The difference is qualitative, along four axes:

What you catalog. Brain = your authored notes. Catalog = a semantic layer over the company's entire, living data estate: tables, warehouses, files, BI models.
Retrieval engine. Brain = index-first plus progressive disclosure, no embeddings. Catalog = semantic search with sub-second latency over a knowledge graph of millions of entities.
Governance. Brain = git and conventions. Catalog = access policies (IAM), data quality, lineage, audit - retrieval respects permissions, so an agent only sees what it's authorized to.
Freshness and concurrency. Brain = a snapshot you maintain yourself. Catalog = always-on, updates with the data, and serves many agents at once through Context APIs and MCP tools.

The file count is just a symptom. The real boundary is data type plus retrieval engine plus governance plus dynamics. For personal knowledge in prose, plain markdown wins on simplicity; for a company's heterogeneous data estate you need a layer like Knowledge Catalog.

And once more, because it matters: the OKF repository bears the note "not an official Google product", and I describe Knowledge Catalog here strictly as a direction I haven't deployed myself. No promises about product features.

What follows from this

Format beats tool. If you're building a knowledge base, design it for a standard from day one. Migrating later costs - compliance from the start is free.
Compliance can be convergent. If your system grows from a good idea (LLM Wiki), you have a shot at hitting the standard without aiming for it. That's good news: you probably don't need to rewrite your brain, just export it.
Knowledge in a standard is an asset. Portable, auditable, exchangeable - and scaling from a personal brain up to Knowledge Catalog.

If you want to start on a ready foundation, I share a second-brain-template: click "Use this template", run /onboard, do a first ingest, and you have your own standard-compliant system in minutes. I'm also working on broader materials about building a second brain - if the topic interests you, it's worth keeping an eye out.

Useful Resources

How Karpathy's LLM Wiki helped me organize my knowledge base - the base article on LLM Wiki, progressive disclosure, and the brain's architecture
Second brain in Obsidian and Claude Code - how to start from scratch
brain.lipowczan.pl - the live instance of my second brain
second-brain-template - a ready template to start with
OKF / knowledge-catalog (repo) - the OKF spec in the okf/ directory (note: "not an official Google product")
Google Knowledge Catalog - a managed data catalog platform (enterprise scale)

FAQ

What is the Open Knowledge Format (OKF) and who is behind it?

OKF (Open Knowledge Format) is an open specification for recording knowledge based on markdown plus YAML frontmatter, designed for working with AI agents. The spec lives in the GoogleCloudPlatform/knowledge-catalog repository, in the okf/ directory. Important note: the repository carries a "This repository and its contents are not an official Google product" disclaimer - it's an open interoperability standard, not a product commitment from Google.

How does OKF differ from Karpathy's LLM Wiki?

OKF is an interoperability specification - it defines how to record knowledge so it's portable and exchangeable between tools. LLM Wiki is a working methodology - it describes how an agent should build and maintain a knowledge base. They're complementary: OKF is about the format, LLM Wiki about the process. A working knowledge base uses both layers at once.

The only required field in OKF is type - what does that mean in practice?

It means a minimal barrier to entry: a compliant OKF file only needs a non-empty type field in its frontmatter, while everything else (title, description, tags, timestamp) is optional. A producer can add any custom keys, because a format consumer must tolerate unknown fields and broken links. This makes the standard liberal on write and strict only where it has to be.

Is my existing Obsidian knowledge base compliant with OKF?

Most likely yes, in large part, especially if you use .md files with frontmatter. In my audit the brain passed 100% of the hard conformance conditions of OKF v0.1 (3/3), and the discrepancies were purely cosmetic. Obsidian [[...]] wikilinks compile through Quartz to [label](path) links, and fields like summary or date map one to one onto description and timestamp on export.

When does plain markdown stop being enough and Google Knowledge Catalog steps in?

The index-first approach on plain markdown works great at personal scale - on the order of hundreds of notes (my brain is a bit over 300 today), without embeddings and without RAG infrastructure. It's not a hard limit, just an order of magnitude. Above that - with millions of documents, structured and unstructured data, and many agents at once - you reach enterprise scale, which is what Google Knowledge Catalog (managed, "formerly Dataplex") addresses. I should stress, though, that I haven't tested this threshold personally - I'm signaling a direction, not a product recommendation.