[RFC] Model metadata storage in PTE files via NamedData #19384

@kirklandsign

🚀 The feature, motivation and pitch

Motivation

PTE files currently carry model weights and delegate data, but lack a standard way to embed model-level metadata such as tokenizer config, chat templates, and architecture info. Users must ship these as separate files or hardcode them in application code.
A self-contained PTE file that includes all metadata needed to run inference would simplify deployment and reduce integration errors.
PTE already has the infrastructure to support this: NamedData.

Proposal

Use the existing NamedData mechanism (flatbuffer Program.named_data) to store model metadata as key-value pairs. No schema changes needed.

Key naming convention

Follow a namespace.field pattern:

general.name             = "Llama-3.2-1B"
general.architecture     = "llama"
tokenizer.model          = "BPE"
tokenizer.vocab_size     = 128256
tokenizer.bos_token_id   = 128000
tokenizer.eos_token_id   = 128001
tokenizer.chat_template  = "{% for message in messages %}..."
context.length           = 8192

All metadata keys are prefixed with "metadata." in the NamedData store (for example, metadata.tokenizer.model) to avoid collisions with backend entries (e.g., XNNPACK constant tensors).
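A minimal sketch of the prefixing rule, in Python. The helper name to_store_key and the exact regular expression are illustrative assumptions, not part of ExecuTorch; the RFC itself only specifies the namespace.field pattern and the "metadata." prefix.

```python
import re

METADATA_PREFIX = "metadata."  # prefix named in the RFC

# Assumed shape of a user-facing key: a namespace, one dot, then a field name.
# The RFC does not pin down the allowed characters; this pattern is a guess
# that accepts every example key listed above.
_KEY_RE = re.compile(r"^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*$")


def to_store_key(user_key: str) -> str:
    """Map a key such as 'tokenizer.model' to the key used in the NamedData store."""
    if not _KEY_RE.match(user_key):
        raise ValueError(f"expected namespace.field, got {user_key!r}")
    return METADATA_PREFIX + user_key
```

With this rule, to_store_key("tokenizer.chat_template") yields "metadata.tokenizer.chat_template", which is the key the C++ example looks up at runtime.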

Value encoding

Values are stored as raw bytes:

  • string → UTF-8 encoded bytes
  • int → int64 little-endian
  • float → float64 little-endian
  • bytes → raw bytes
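The encoding rules above could be sketched as follows, using only the Python standard library. The function names are hypothetical, and rejecting bool is an assumption made here because bool is a subclass of int in Python and is not one of the listed types.

```python
import struct
from typing import Union

MetadataValue = Union[str, int, float, bytes]


def encode_value(value: MetadataValue) -> bytes:
    """Encode one metadata value to raw bytes following the rules listed above."""
    if isinstance(value, bool):
        # Assumption: bool is excluded so True is not silently stored as int 1.
        raise TypeError("bool is not a supported metadata type")
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, int):
        return struct.pack("<q", value)  # int64, little-endian
    if isinstance(value, float):
        return struct.pack("<d", value)  # float64, little-endian
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    raise TypeError(f"unsupported metadata type: {type(value).__name__}")


def decode_int(raw: bytes) -> int:
    """Interpret 8 raw bytes as a little-endian int64."""
    return struct.unpack("<q", raw)[0]


def decode_float(raw: bytes) -> float:
    """Interpret 8 raw bytes as a little-endian float64."""
    return struct.unpack("<d", raw)[0]


def decode_str(raw: bytes) -> str:
    """Interpret raw bytes as UTF-8 text."""
    return raw.decode("utf-8")
```

Because the stored bytes carry no type information, the reader has to know which decoder to call for each key: an 8-byte payload could equally be an int, a float, or an 8-character string.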

Python API (export side)

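from executorch.exir import to_edge_transform_and_lower
# add_metadata is the helper proposed in this RFC.
from executorch.extension.llm.export.metadata import add_metadata

edge_manager = to_edge_transform_and_lower(exported_program, ...)
# Keys are passed without the "metadata." prefix; it is applied in the NamedData store.
add_metadata(edge_manager, {
    "tokenizer.model": "BPE",
    "tokenizer.vocab_size": 128256,
    "tokenizer.chat_template": template_str,
    "general.architecture": "llama",
})
et_program = edge_manager.to_executorch()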

C++ API (runtime side)

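// Both calls return Result<>; this assumes the enclosing function returns Error.
auto map = program->get_named_data_map();
if (!map.ok()) { return map.error(); }
auto result = map.get()->get_data("metadata.tokenizer.chat_template");
if (!result.ok()) { return result.error(); }
// The view is only valid while `result` (a FreeableBuffer) is alive.
std::string_view chat_template(
    static_cast<const char*>(result->data()), result->size());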

Why NamedData over a new schema field?

  • Zero schema changes: works with existing PTE format
  • Already battle-tested: XNNPACK, AOTI, CoreML all use NamedData in production
  • Linear key lookup is fine for ~20-50 metadata keys
  • Prefix-based namespace isolation keeps metadata separate from backend data
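To illustrate the last two points, here is a rough Python model of a store holding both backend and metadata entries. It is not the ExecuTorch implementation: the list-of-pairs layout, the function names, and the backend key "xnnpack.weight_0" are all made up for the example.

```python
from typing import Dict, List, Optional, Tuple

METADATA_PREFIX = "metadata."
Entry = Tuple[str, bytes]  # (key, raw value bytes)


def find_entry(entries: List[Entry], key: str) -> Optional[bytes]:
    """Linear scan over all entries; cheap when there are only tens of keys."""
    for entry_key, data in entries:
        if entry_key == key:
            return data
    return None


def metadata_view(entries: List[Entry]) -> Dict[str, bytes]:
    """Collect only metadata entries, with the prefix stripped from each key."""
    return {
        key[len(METADATA_PREFIX):]: data
        for key, data in entries
        if key.startswith(METADATA_PREFIX)
    }


# A mixed store: one hypothetical backend entry and two metadata entries.
store: List[Entry] = [
    ("xnnpack.weight_0", b"\x00\x01\x02\x03"),
    ("metadata.general.architecture", b"llama"),
    ("metadata.tokenizer.model", b"BPE"),
]
```

Here metadata_view(store) returns only the two metadata entries, so backend data never leaks into the metadata namespace even though both share one store.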

Open questions

  1. Should we define a standard set of well-known keys, or leave it fully user-defined?
  2. Should EdgeProgramManager expose a public add_metadata() method instead of going through _named_data_store?
  3. Do we need type tags in the stored bytes, or is type interpretation the caller's responsibility?
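For question 3, one possible shape of a type tag is a single leading byte in front of the payload. The sketch below is purely illustrative: the tag values and function names are invented here and are not part of the proposal.

```python
import struct
from typing import Union

MetadataValue = Union[str, int, float, bytes]

# Hypothetical one-byte tags; the RFC does not define any.
TAG_STR, TAG_INT, TAG_FLOAT, TAG_BYTES = b"s", b"i", b"f", b"b"


def encode_tagged(value: MetadataValue) -> bytes:
    """Prefix the raw encoding with a one-byte type tag."""
    if isinstance(value, bool):
        raise TypeError("bool is not a supported metadata type")
    if isinstance(value, str):
        return TAG_STR + value.encode("utf-8")
    if isinstance(value, int):
        return TAG_INT + struct.pack("<q", value)
    if isinstance(value, float):
        return TAG_FLOAT + struct.pack("<d", value)
    if isinstance(value, (bytes, bytearray)):
        return TAG_BYTES + bytes(value)
    raise TypeError(f"unsupported metadata type: {type(value).__name__}")


def decode_tagged(raw: bytes) -> MetadataValue:
    """Recover the original value without the caller knowing its type."""
    if not raw:
        raise ValueError("empty payload has no type tag")
    tag, payload = raw[:1], raw[1:]
    if tag == TAG_STR:
        return payload.decode("utf-8")
    if tag == TAG_INT:
        return struct.unpack("<q", payload)[0]
    if tag == TAG_FLOAT:
        return struct.unpack("<d", payload)[0]
    if tag == TAG_BYTES:
        return payload
    raise ValueError(f"unknown type tag {tag!r}")
```

The trade-off is one extra byte per value and a string payload that no longer starts at offset 0, so a runtime reader such as the C++ example would have to skip the tag before building its string_view.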

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

cc @larryliu0820 @mergennachin @cccclai @helunwencser @jackzhxng

Labels: module: llm, triaged
