Skip to content

Conversation

@sahar-fehri
Copy link
Contributor

@sahar-fehri sahar-fehri commented Dec 9, 2025

Description

Optimizes TokenListController storage to reduce write amplification by persisting tokensChainsCache via StorageService using per-chain files instead of a single monolithic state property.

Related: https://github.com/MetaMask/metamask-mobile/pull/22943/files

Related: https://github.com/MetaMask/decisions/pull/110

Related: #7192

Explanation

The tokensChainsCache (~5MB total, containing token lists for all chains) was persisted as part of the controller state. Every time a single chain's token list was updated (~100-500KB), the entire ~5MB cache was rewritten to disk, causing:

  • Startup performance issues (loading large state on app initialization)
  • Runtime performance degradation (frequent large writes during token fetches)
  • Impacts both extension

Solution

Per-Chain File Storage:
Each chain's cache is now stored in a separate file (e.g., tokensChainsCache:0x1, tokensChainsCache:0x89)
Only the updated chain (~100-500KB) is written on each token fetch, reducing write operations by ~90-95%
All chains are loaded in parallel at startup to maintain compatibility with TokenDetectionController
Key Changes:

  • Set tokensChainsCache metadata to persist: false to prevent framework-managed persistence
  • Added #loadCacheFromStorage() to load all per-chain files in parallel on initialization
  • Added #saveChainCacheToStorage(chainId) to persist only the specific chain that changed
  • Added #migrateStateToStorage() to automatically migrate existing cache data on first launch after upgrade
  • Updated clearingTokenListData() to remove all per-chain files

References

Checklist

  • I've updated the test suite for new or updated code as appropriate
  • I've updated documentation (JSDoc, Markdown, etc.) for new or updated code as appropriate
  • I've communicated my changes to consumers by updating changelogs for packages I've changed
  • I've introduced breaking changes in this PR and have prepared draft pull requests for clients and consumer packages to resolve them

Note

Persist tokensChainsCache using per‑chain files via StorageService with migration, parallel load, and updated tests and configs.

  • Assets Controllers – TokenListController:
    • Persist tokensChainsCache via StorageService using per‑chain keys (e.g., tokensChainsCache:<chainId>); set state metadata persist: false.
    • Add initialization flow: load all chain caches in parallel, migrate legacy in‑state cache to per‑chain storage; guard with mutex and an initialization promise.
    • Save only the updated chain after fetch; make clearingTokenListData() async to wipe state and remove all cache keys; add error handling on network change and storage ops.
    • Refactors: private fields (#intervalId, #chainId, etc.), export DataCache, minor type tweaks.
  • Tests:
    • Extensive new tests mocking StorageService for load/save/clear, migration, error paths; update calls to await clearingTokenListData().
  • Build/Deps:
    • Add dependency @metamask/storage-service; add TypeScript project references; update yarn.lock.
  • Changelog:
    • Document new per‑chain persistence and migration notes.

Written by Cursor Bugbot for commit 9332b6e. This will update automatically on new commits. Configure here.

tokensChainsCache: {
includeInStateLogs: false,
persist: true,
persist: false, // Persisted separately via StorageService
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this false to block disk writes

'TokenListController: Failed to load cache from storage:',
error,
);
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Race condition causes fresh API data loss

The constructor starts #loadCacheFromStorage() as a fire-and-forget async operation without using the mutex. Since #loadCacheFromStorage replaces the entire tokensChainsCache state object (state.tokensChainsCache = loadedCache), while fetchTokenList only updates a single chain, there's a race condition. If start() or fetchTokenList() is called immediately after construction and completes before the async storage load finishes, the subsequent storage load will overwrite the fresh API data with old cached data, causing data loss. This also means cache validity checks may incorrectly return false before storage loading completes, triggering unnecessary API calls.

Additional Locations (1)

Fix in Cursor Fix in Web

'TokenListController: Failed to clear cache from storage:',
error,
);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Race condition between clear and fetch causes storage inconsistency

clearingTokenListData performs async storage operations without mutex protection, while fetchTokenList uses #mutex. When clearingTokenListData is called fire-and-forget from #onNetworkControllerStateChange, state clears immediately, then getAllKeys runs asynchronously. If fetchTokenList executes during this window, it saves new data to storage. When getAllKeys returns, it may include the newly-saved key, which then gets deleted. This leaves state and storage inconsistent - state has data, storage doesn't - causing data loss on subsequent app restarts.

Additional Locations (1)

Fix in Cursor Fix in Web

Object.keys(dataToMigrate).map((chainId) =>
this.#saveChainCacheToStorage(chainId as Hex),
),
);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Single-file storage migration loses data due to state mismatch

When migrating from old single-file storage, dataToMigrate is set to oldStorageData, but #saveChainCacheToStorage reads chain data from this.state.tokensChainsCache[chainId] instead of the migration source. Since #loadCacheFromStorage only loads per-chain files (keys with : suffix), the old single-file data is never loaded into state. The migration silently fails (logging warnings), then the old storage is deleted at lines 386-392, causing permanent data loss.

Additional Locations (1)

Fix in Cursor Fix in Web

Object.keys(dataToMigrate).map((chainId) =>
this.#saveChainCacheToStorage(chainId as Hex),
),
);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Migration from single-file storage loses all cached data

When migrating from old single-file storage format, dataToMigrate is set to oldStorageData but this data is never loaded into this.state.tokensChainsCache. The migration then calls #saveChainCacheToStorage(chainId) for each chain, but that method reads from this.state.tokensChainsCache[chainId] which is empty/undefined. The method logs a warning and returns without saving anything. After this failed "migration", the old single-file storage is deleted at lines 386-392, resulting in complete data loss. The dataToMigrate variable is never actually used to save the data it contains.

Additional Locations (1)

Fix in Cursor Fix in Web

this.update((state) => {
state.tokensChainsCache = loadedCache;
});
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Race condition allows async loading to overwrite fresh data

The #loadCacheFromStorage() method runs as fire-and-forget in the constructor without using the #mutex that protects fetchTokenList(). Since loading completes asynchronously and performs a full cache replacement (state.tokensChainsCache = loadedCache), if fetchTokenList() is called immediately after construction and completes before loading finishes, the freshly fetched API data will be silently overwritten with potentially stale cached data. The async loading and other state-modifying operations can run concurrently with no synchronization.

Additional Locations (1)

Fix in Cursor Fix in Web

this.clearingTokenListData();
this.clearingTokenListData().catch((error) => {
console.error('Failed to clear token list data:', error);
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Race condition in async clear during network change

The clearingTokenListData() method changed from synchronous to async, but is called without await in #onNetworkControllerStateChange. This creates a potential race condition: the method immediately clears state, then asynchronously fetches all storage keys to delete. If a concurrent fetchTokenList() call saves data to storage between when getAllKeys returns and when keys are deleted, that newly-saved data could be inadvertently deleted. While state clearing is still synchronous, storage and state could become inconsistent if operations interleave during this async window.

Additional Locations (1)

Fix in Cursor Fix in Web


if (hasPerChainFiles) {
return; // Already migrated
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Partial migration failure causes permanent cache data loss

The migration check uses allKeys.some() to determine if migration is complete, returning early if ANY per-chain file exists. However, #saveChainCacheToStorage catches errors internally without re-throwing, so if migration partially fails (e.g., chains A and B exist in state but only A saves successfully), the migration appears complete. On subsequent launches, hasPerChainFiles returns true because chain A exists, so migration is skipped entirely. Combined with persist: false on the state metadata, chain B's cache data is permanently lost since the old framework-managed state is no longer saved and migration won't retry for missing chains.

Additional Locations (1)

Fix in Cursor Fix in Web


### Changed

- `TokenListController` now persists `tokensChainsCache` via `StorageService` using per-chain files to reduce write amplification ([#7413](https://github.com/MetaMask/core/pull/7413))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!!

Should we mark this as breaking since we are adding a new action?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a breaking change ?

salimtb
salimtb previously approved these changes Dec 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants