What counts as the same instrument when every broker speaks a different dialect? • David Beveridge

This started as a simple question about securities taxonomy and turned into a much better platform question: what exactly is the thing my system thinks it is trading when one broker says asset_class, another says secType, and a third wants a completely different execution identifier?

The first version of this problem sounds administrative. Make a list of security types. Map them across brokers. Move on.

I do not think that framing survives contact with a second provider.

The real problem is not naming categories. The real problem is deciding what the platform treats as canonical. If the core model quietly becomes "whatever Alpaca calls this thing," then adding Schwab, Interactive Brokers, or a futures venue later turns into a long cleanup disguised as integration work. The platform starts leaking provider assumptions into order validation, position tracking, search, symbology, and even the way users think about what an instrument is.

That is the part I keep coming back to: in a multi-provider trading system, the dangerous mistake is not having an incomplete list of securities. It is confusing instrument identity with provider execution representation.

The first mistake is turning a broker label into your ontology

Every provider has some version of type information, but it does not all mean the same thing. One surface is optimized for execution, another for market data lookup, another for compliance or approval gating, and another for contract identity. They can overlap without being interchangeable.

That is why I do not think a platform should promote a provider-native field like security_type, asset_class, or secType into the center of the domain model. Those fields are useful. They are just not the whole truth. They usually collapse several different concerns into one label:

Layer	Question it answers	Examples
Canonical instrument identity	What economic object is this, independent of broker quirks?	US common equity, listed equity option, perpetual crypto pair, a specific dated futures contract (not the rolling "front month" alias).
Provider execution representation	What identifier and shape does this broker need in order to route the order?	Provider contract id, option symbol format, venue code, native asset type.
Capabilities and constraints	What can this account do with this instrument here?	Fractional enabled, shortable, marginable, option approval required, closing-only.

If those layers get merged, the system starts lying. A stock and an ETF may both look like "equity" for one provider while needing different handling somewhere else. A provider may expose an option as a symbol string that looks complete, but the platform still needs the underlying, expiry, strike, multiplier, exercise style, and settlement assumptions if it wants to reason correctly about that contract later. A crypto pair may be tradable at one venue and meaningless at another unless the venue and quote currency are carried explicitly.

A cleaner model has three seams

My current read is that the core model should keep three seams brutally clear.

First, there should be a canonical instrument record. That is the platform's answer to "what is this thing?" It needs a stable internal id, a family or class, identifiers, venue or listing context where relevant, currency context, and contract specifications for anything derivative.

Second, there should be a provider mapping layer. That is where the platform says "when this canonical instrument is routed through provider X, use these execution ids, this symbol form, this venue, and these translation rules."

Third, there should be a capabilities layer. That is where the platform says "this account at this provider can place these order types for this instrument family, but only under these restrictions."

That separation feels boring, which is usually a good sign. Good platform seams often feel obvious right after you say them out loud and expensive right before you implement them.
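To make the three seams concrete, here is a minimal sketch in Python. Every name in it (the class names, `provider_a`, `provider_b`, the id formats, the native type labels) is made up for illustration and is not taken from any real broker API. The only point is that the three records are separate and are joined by the canonical instrument id.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CanonicalInstrument:
    """Seam 1: what the thing is, independent of any broker."""
    instrument_id: str                  # stable internal id (format is made up)
    family: str                         # e.g. "equity", "option", "crypto_spot"
    currency: str
    primary_listing: Optional[str] = None


@dataclass(frozen=True)
class ProviderMapping:
    """Seam 2: how one provider wants this instrument expressed for routing."""
    instrument_id: str
    provider: str
    native_id: str                      # ticker, contract id, OCC string, ...
    native_type: str                    # the provider's own type label
    venue: Optional[str] = None


@dataclass(frozen=True)
class Capability:
    """Seam 3: what one account may do with this instrument at one provider."""
    instrument_id: str
    provider: str
    account_id: str
    fractional: bool = False
    shortable: bool = False
    marginable: bool = False
    closing_only: bool = False


aapl = CanonicalInstrument("inst-000001", "equity", "USD", primary_listing="XNAS")

# Two hypothetical providers, two native representations, one canonical identity.
mappings = [
    ProviderMapping(aapl.instrument_id, "provider_a", "AAPL", "us_equity"),
    ProviderMapping(aapl.instrument_id, "provider_b", "C-12345", "STK", venue="SMART"),
]

# Capabilities hang off (instrument, provider, account), not off the taxonomy.
caps = Capability(aapl.instrument_id, "provider_a", "acct-1",
                  fractional=True, shortable=True)

print({m.provider: m.native_id for m in mappings})
```

Nothing in `CanonicalInstrument` knows what any provider calls the instrument, and that is the property worth protecting.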

Symbology is a resolver problem, not the primary key

The symbol problem is what usually exposes weak models first. Humans type AAPL. A provider may want a contract id. Another may want an OCC-style option symbol. Another may allow a plain ticker but only if the primary listing can be inferred correctly. None of that means the user-facing input should become the canonical identity.

I think the platform needs an explicit resolver pipeline instead:

1. Take user input or upstream strategy input.
2. Resolve it into a canonical instrument id using symbol, venue, asset family, and contract details.
3. Attach all known external identifiers to that canonical record.
4. Translate from canonical record to the exact provider-native execution representation at routing time.

That sounds like extra machinery until you hit the alternatives. Without a resolver, the platform ends up pretending that a user symbol is globally unique. It is not. Tickers collide. Listings differ by venue. Corporate actions rename symbols. Options compress critical metadata into symbol formats that are convenient for screens and terrible as the only durable identity.

The resolver is where the platform earns the right to stay sane. It is also where multiple identifier families can coexist without taking over the whole model: ticker, FIGI, ISIN, CUSIP, OCC symbology, venue codes, provider contract ids, and whatever internal id the platform uses for durable storage.
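A rough sketch of the resolution half of that pipeline, assuming an in-memory index and hypothetical names throughout (`Resolver`, `Identifier`, the `inst-…` ids, and the placeholder ISIN are all invented). It shows one behavior I care about: a bare ticker that matches more than one instrument is an error to surface, not a guess to make.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identifier:
    scheme: str                   # "ticker", "isin", "figi", "occ", ...
    value: str
    venue: Optional[str] = None   # listing venue, where the scheme needs one


class UnknownSymbol(LookupError):
    pass


class AmbiguousSymbol(LookupError):
    pass


class Resolver:
    def __init__(self):
        # (scheme, value) -> set of (venue, canonical instrument id)
        self._index = {}

    def register(self, instrument_id: str, ident: Identifier) -> None:
        key = (ident.scheme, ident.value.upper())
        self._index.setdefault(key, set()).add((ident.venue, instrument_id))

    def resolve(self, scheme: str, value: str, venue: Optional[str] = None) -> str:
        candidates = self._index.get((scheme, value.upper()), set())
        if venue is not None:
            candidates = {c for c in candidates if c[0] == venue}
        ids = {instrument_id for _, instrument_id in candidates}
        if not ids:
            raise UnknownSymbol(f"{scheme}:{value}")
        if len(ids) > 1:
            # Refuse to pretend a ticker is globally unique.
            raise AmbiguousSymbol(f"{scheme}:{value} matches {sorted(ids)}")
        return ids.pop()


resolver = Resolver()
# The same ticker listed on two venues maps to two different instruments.
resolver.register("inst-1", Identifier("ticker", "ABC", venue="XNYS"))
resolver.register("inst-2", Identifier("ticker", "ABC", venue="XTSE"))
# Several identifier families can point at one canonical record.
resolver.register("inst-1", Identifier("isin", "XX0000000001"))  # placeholder ISIN

print(resolver.resolve("ticker", "abc", venue="XNYS"))
print(resolver.resolve("isin", "XX0000000001"))
```

Calling `resolver.resolve("ticker", "ABC")` with no venue raises `AmbiguousSymbol`. A real resolver would also need effective dates to cope with renames from corporate actions, which this sketch leaves out.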

Derivatives are where loose models start lying

You can get away with fuzzy thinking longer in cash equities than you can in derivatives.

The moment the platform trades options, futures, or more exotic contracts, the model has to admit that "symbol plus type" is not enough. A listed option without expiry, strike, put/call, multiplier, exercise style, and settlement assumptions is not really modeled. It is only named.

That is why I like the idea of using something like ISO 10962 CFI as a classification spine rather than as the whole model. A classification code can help keep the taxonomy disciplined. It does not remove the need for explicit contract fields. The platform still has to know what makes one contract economically distinct from another.

instrument
  id
  family
  subtype
  primary_listing
  base_currency
  quote_currency
  identifiers[]
  underlying_instrument_id
  contract_spec {
    expiry
    strike
    option_kind
    exercise_style
    multiplier
    settlement_type
  }

I would rather see a small model like that, with room to grow, than a giant enum pretending to solve identity by itself.
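As a sketch, the outline above translates fairly directly into Python dataclasses. The field names follow the outline. The validation rule, the example ids, and the string values such as `"listed_equity_option"` are my own assumptions, not a finished schema.

```python
from dataclasses import dataclass, replace
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ContractSpec:
    expiry: date
    strike: Decimal
    option_kind: str        # "call" or "put"
    exercise_style: str     # "american" or "european"
    multiplier: int
    settlement_type: str    # "physical" or "cash"


@dataclass(frozen=True)
class Instrument:
    id: str
    family: str
    subtype: str
    primary_listing: Optional[str] = None
    base_currency: Optional[str] = None
    quote_currency: Optional[str] = None
    identifiers: tuple = ()             # (scheme, value) pairs
    underlying_instrument_id: Optional[str] = None
    contract_spec: Optional[ContractSpec] = None

    def __post_init__(self):
        # Assumed rule: an option that is only named, not specified, is rejected.
        if self.family == "option" and (
            self.contract_spec is None or self.underlying_instrument_id is None
        ):
            raise ValueError("an option needs an underlying and a contract_spec")


spy_500c = Instrument(
    id="inst-opt-1",
    family="option",
    subtype="listed_equity_option",
    quote_currency="USD",
    underlying_instrument_id="inst-spy",
    contract_spec=ContractSpec(
        expiry=date(2025, 6, 20),
        strike=Decimal("500"),
        option_kind="call",
        exercise_style="american",
        multiplier=100,
        settlement_type="physical",
    ),
)

# Same underlying, same expiry, different strike: a different economic object.
spy_510c = replace(
    spy_500c,
    id="inst-opt-2",
    contract_spec=replace(spy_500c.contract_spec, strike=Decimal("510")),
)

print(spy_500c.contract_spec == spy_510c.contract_spec)
```

The two contracts share a family, subtype, and underlying, so a type label alone cannot tell them apart. Only the explicit contract fields can.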

The provider layer should absorb quirks so the core does not

The translation layer is the part that lets the core stay clean. If Alpaca, IBKR, Schwab, and some future exchange all describe the same economic object differently, the provider adapter should carry that burden, not the strategy engine and not the user-facing platform model.

That also points to a useful test for plugin or manifest design. If the manifest can only declare "this provider supports equities and options," that is not enough. The platform eventually needs it to describe things like identifier types, preferred lookup paths, required contract fields, asset-family support, and capability matrices.

I would want the plugin layer to answer questions like these in a structured way:

What identifier does this provider consider authoritative for routing?
Can it resolve from ticker, or does it need a richer contract lookup first?
Which asset families are supported at all?
Which order types, time-in-force values, approvals, and restrictions apply by family or venue?
Which fields are mandatory for derivatives or other contract-heavy instruments?

Once those answers live in a provider-facing layer, the rest of the platform can become much simpler. Search resolves into canonical identity. Routing resolves into provider-native ids. Risk checks and validation can ask capabilities questions without pretending they are taxonomy questions.
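Here is one way a manifest could carry those answers in structured form. This is a sketch under my own assumptions: the manifest shape, the field names, and the `check_order` helper are all hypothetical, and the values for `provider_a` are invented rather than describing any real broker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FamilySupport:
    order_types: frozenset
    time_in_force: frozenset
    required_contract_fields: frozenset = frozenset()
    requires_approval: bool = False   # enforced by an account-level check, not shown


@dataclass(frozen=True)
class ProviderManifest:
    provider: str
    routing_identifier: str        # scheme the provider treats as authoritative
    resolves_from_ticker: bool     # False means a contract lookup is needed first
    families: dict                 # family name -> FamilySupport


def check_order(manifest, family, order_type, tif, contract_fields=()):
    """Return the reasons an order cannot be routed; an empty list means OK."""
    support = manifest.families.get(family)
    if support is None:
        return [f"{manifest.provider} does not support {family}"]
    problems = []
    if order_type not in support.order_types:
        problems.append(f"order type {order_type} not allowed for {family}")
    if tif not in support.time_in_force:
        problems.append(f"time in force {tif} not allowed for {family}")
    missing = support.required_contract_fields - set(contract_fields)
    if missing:
        problems.append("missing contract fields: " + ", ".join(sorted(missing)))
    return problems


manifest = ProviderManifest(
    provider="provider_a",
    routing_identifier="ticker",
    resolves_from_ticker=True,
    families={
        "equity": FamilySupport(frozenset({"market", "limit"}),
                                frozenset({"day", "gtc"})),
        "option": FamilySupport(
            frozenset({"limit"}),
            frozenset({"day"}),
            required_contract_fields=frozenset({"expiry", "strike", "option_kind"}),
            requires_approval=True,
        ),
    },
)

print(check_order(manifest, "equity", "market", "day"))
print(check_order(manifest, "option", "market", "day", contract_fields=["expiry"]))
print(check_order(manifest, "future", "limit", "day"))
```

Validation asks the manifest a capabilities question and gets back reasons. It never has to inspect a provider-native type label to do so.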

Three examples make the point faster than another abstract definition

Instrument	Canonical view	Provider-specific view
AAPL common stock	Equity instrument, issuer-linked, listed on a primary venue, common-share subtype.	One provider may accept a ticker plus venue; another may require a contract id or internal security handle.
SPY call option	Derivative on SPY with explicit expiry, strike, call/put flag, multiplier, and exercise style.	Routing may require an OCC-style symbol at one provider and a contract lookup id at another.
BTC/USD spot pair	Two-currency instrument with explicit base and quote assets plus venue context if needed.	Tradability, hours, min order size, and symbol format can vary heavily by provider or venue.

Those are not edge cases. They are the normal cases once a platform stops being single-provider.
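The SPY option row is the easiest one to sketch in code. An OCC-style symbol packs the root (padded to six characters), the expiry as YYMMDD, a C or P flag, and the strike times 1000 as eight zero-padded digits. Below, one hypothetical provider takes a compact, unpadded variant of that string and another takes an opaque contract id. Both adapters, the dict layout, and the ids are assumptions for illustration.

```python
from datetime import date
from decimal import Decimal


def occ_symbol(root, expiry, option_kind, strike, padded=True):
    """Build an OCC-style option symbol from explicit contract fields."""
    root_part = root.ljust(6) if padded else root
    cp = {"call": "C", "put": "P"}[option_kind]
    strike_part = f"{int(strike * 1000):08d}"   # strike in thousandths, 8 digits
    return f"{root_part}{expiry:%y%m%d}{cp}{strike_part}"


# Canonical view: explicit fields, no provider string anywhere.
spy_call = {
    "id": "inst-opt-1",
    "root": "SPY",
    "expiry": date(2025, 6, 20),
    "option_kind": "call",
    "strike": Decimal("500"),
}


def route_via_provider_a(opt):
    # Hypothetical provider that accepts the compact, unpadded symbol form.
    symbol = occ_symbol(opt["root"], opt["expiry"], opt["option_kind"],
                        opt["strike"], padded=False)
    return {"symbol": symbol}


def route_via_provider_b(opt, contract_ids):
    # Hypothetical provider that requires a previously looked-up contract id.
    return {"contract_id": contract_ids[opt["id"]]}


print(route_via_provider_a(spy_call))
print(route_via_provider_b(spy_call, {"inst-opt-1": "C-98765"}))
```

The symbol string is derived from the canonical fields at routing time. Going the other way, parsing identity back out of a provider string, is the dependency I would want to avoid.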

My current synthesis

I started this question thinking mostly about taxonomy. I ended it thinking mostly about boundaries.

The platform does need a canonical taxonomy, but not in the shallow sense of "make a bigger list of security types." It needs a canonical instrument identity model that can survive multiple providers, multiple identifier families, and contract-heavy instruments without leaking execution quirks into the core domain.

So my current read is straightforward:

use a canonical instrument record as the source of truth
treat symbols and external ids as resolver inputs and mappings, not as identity by themselves
model derivatives explicitly instead of hiding them behind type labels
push provider quirks into translation layers and manifest-declared capabilities
let taxonomy answer "what is this?" and let capabilities answer "what can this account do with it here?"

That is the design I trust more than a one-big-enum approach. It is smaller where it matters, stricter where it matters, and much more likely to survive the day someone says, "Great, Alpaca works. Now add Schwab, IBKR, and whatever comes next."