How to develop securely on Soroban? Storage types with unbounded data

Veridise
Mar 15, 2024

Introducing Soroban

Soroban is a new smart contract platform that integrates into the existing Stellar blockchain. Soroban is independent of Stellar, though, and can be used by any transaction processor, including other blockchains. Stellar is one of the earliest blockchain implementations, with a nine-year history.

Soroban was launched in March 2023, and at the same time, a $100M adoption fund was announced to support developers building on Soroban.

The Soroban platform includes the smart contract environment and a Rust SDK that can be used to write smart contracts. Specifically, the smart contract environment utilizes the WebAssembly (Wasm) virtual machine, and contract code is compiled into Wasm bytecode. This allows for the potential support of other languages in the future besides Rust.

Soroban introduced ‘footprints’, or transaction dependencies. This feature allows for the grouping of transactions and their concurrent execution. It enables parallel computation and leverages modern multi-core hardware.

How to avoid DoS risk with unbounded data

At Veridise, we’ve audited DApps built on Soroban and encountered common pitfalls. We’d like to share these observations with the Soroban developer community.

In this blog post, we focus on storage layout considerations in Soroban, specifically how to avoid DoS risk with unbounded data.

We decided to write about this topic because we’ve seen the same mistakes made by several developers building on Soroban.

To be clear, this blog post is not intended to be a comprehensive ‘security best practices’ article. There are other considerations which we might publish later in a separate article.

Let’s get down to this topic and start with a general introduction on Soroban’s storage layout.

Storage Layout on Soroban

Let’s examine the three storage types available on Soroban: Temporary Storage, Instance Storage, and Persistent Storage.

The first type, Temporary Storage, is exactly that — temporary. When the data expires it is deleted from the ledger permanently. It’s suitable, for example, for storing a recent price of a token against the US dollar for a short amount of time.

This differs from Ethereum, for example, which currently has no concept of temporary storage (storage that expires), although core developers are considering introducing something similar.

Data in the last two types (Instance and Persistent Storage) can always be recovered, even after it has been deleted from the ledger.

The main difference between them is that Instance Storage has a limited amount of space available, and the data saved there is attached directly to the contract instance. This makes it suitable, for example, for storing global contract data such as metadata or admin accounts.
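The expiry behaviour described above can be summarized in a few lines of plain Rust. This is only a toy model: the `StorageType` and `AfterExpiry` enums and the `after_expiry` function are names made up for illustration and are not part of the Soroban SDK.

```rust
/// The three Soroban storage types discussed in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageType {
    Temporary,
    Instance,
    Persistent,
}

/// What happens to an entry once it has expired (illustrative naming).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AfterExpiry {
    /// The data is gone for good and cannot be brought back.
    DeletedPermanently,
    /// The data is no longer live on the ledger but can be recovered.
    Recoverable,
}

/// Maps each storage type to its behaviour on expiry, as described above.
pub fn after_expiry(storage: StorageType) -> AfterExpiry {
    match storage {
        StorageType::Temporary => AfterExpiry::DeletedPermanently,
        StorageType::Instance | StorageType::Persistent => AfterExpiry::Recoverable,
    }
}
```

For example, `after_expiry(StorageType::Temporary)` yields `AfterExpiry::DeletedPermanently`, while both other storage types yield `AfterExpiry::Recoverable`.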

Instance Storage should not be used with unbounded data

A unique aspect of Instance Storage is that its entire content is loaded every time you interact with a contract. You cannot avoid loading the content, even if you’re not using any ledger data. This can lead to more expensive function invocations as the stored data grows over time, both computationally and financially. Instance Storage allows for 64 KB of data storage.

Therefore, there’s a security consideration with unbounded data.

Take for example the pattern of a factory contract, which is a contract designed to deploy more contracts. If this contract stores a new identifier each time it issues a new contract, this variable is unbounded in nature and will continuously increase over time.

This should not be saved in Instance Storage.

If you have data that keeps growing without a limit (e.g. the balance of each user), you should always use Persistent Storage.

To summarize, and to quote the Soroban docs, Instance Storage should be used for small data directly associated with the current contract, such as its admin, configuration settings, tokens the contract operates on etc. Instance Storage should not be used with any data that can scale in unbounded fashion (such as user balances).
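To see why unbounded data in Instance Storage is a problem, here is a rough model in plain Rust (no Soroban SDK involved). It assumes, as described above, that the whole instance is loaded on every invocation and that it is capped at 64 KB. The `InstanceModel` type, its methods, and the byte accounting (key length plus value length) are simplifications invented for this sketch, not how the host actually meters storage.

```rust
use std::collections::BTreeMap;

/// Size limit quoted in the text for Instance Storage (64 KB).
pub const INSTANCE_LIMIT_BYTES: usize = 64 * 1024;

/// Toy stand-in for a contract's Instance Storage: one blob of key/value
/// pairs that is loaded as a whole on every invocation.
#[derive(Default)]
pub struct InstanceModel {
    entries: BTreeMap<String, Vec<u8>>,
}

impl InstanceModel {
    /// Total bytes held in the instance (keys plus values, simplified).
    pub fn size_bytes(&self) -> usize {
        self.entries.iter().map(|(k, v)| k.len() + v.len()).sum()
    }

    /// Stores a value; fails once the whole instance would exceed the limit.
    pub fn set(&mut self, key: &str, value: Vec<u8>) -> Result<(), String> {
        // Bytes currently used by this key, if it is being overwritten.
        let old = self.entries.get(key).map_or(0, |v| key.len() + v.len());
        let new_size = self.size_bytes() - old + key.len() + value.len();
        if new_size > INSTANCE_LIMIT_BYTES {
            return Err(format!("instance would grow to {} bytes", new_size));
        }
        self.entries.insert(key.to_string(), value);
        Ok(())
    }

    /// Bytes read by any invocation, even one that touches no key at all.
    pub fn bytes_loaded_per_invocation(&self) -> usize {
        self.size_bytes()
    }
}
```

In this model, a contract that only stores a 32-byte admin value under the key `admin` loads 37 bytes per call. Every additional entry raises that figure for all later calls, and once the limit is reached `set` starts failing.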

Example 1 — How *not* to use Instance Storage

Take for example the following function from a factory contract. The function is used to deploy new contracts using the concatenation of token_a and token_b as the salt.

Each time a contract is created, the function create_liquidity_pool_1 saves the address of the recently created contract into the Instance Storage of the contract using as key the tuple formed by token_a and token_b. Hence, the Instance Storage grows each time a new contract is deployed.

Given that the Instance Storage is loaded completely every time the contract is invoked, using unbounded data will cause the invocations to become more and more expensive over time until a DoS state is reached.

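use soroban_sdk::{xdr::ToXdr, Address, Bytes, BytesN, Env};

fn create_liquidity_pool_1(
    env: Env,
    lp_init_info: LiquidityPoolInitInfo,
    lp_wasm_hash: BytesN<32>,
    token_a: Address,
    token_b: Address,
) {
    // Build the deployment salt from the serialized token addresses.
    let mut salt = Bytes::new(&env);
    salt.append(&token_a.clone().to_xdr(&env));
    salt.append(&token_b.clone().to_xdr(&env));
    // Assumption: the deployer expects a 32-byte salt, so the concatenation is hashed.
    let salt = env.crypto().sha256(&salt);

    let lp_contract_address = env.deployer()
        .with_current_contract(salt)
        .deploy(lp_wasm_hash);

    // Anti-pattern: every new pool adds another entry to Instance Storage.
    env.storage().instance().set(
        &PairTupleKey {
            token_a: token_a.clone(),
            token_b: token_b.clone(),
        },
        &lp_contract_address,
    );
}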

Persistent storage may also be vulnerable to unbounded data

Persistent storage can also pose DoS risks. Let’s examine another example.

Example 2 — How *not* to use Persistent Storage

Take, for example, the following function, which has the same purpose as the previous one.

The two differences are:

(1) It uses Persistent Storage rather than Instance Storage.
(2) It saves the address of each newly created contract into a vector.

The issue is similar to the previous function. With each call of this function, create_liquidity_pool_2, the vector lp_vec grows in size. When the associated Ledger Entry reaches its maximum size, it will no longer be possible to create new pools.

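use soroban_sdk::{xdr::ToXdr, Address, Bytes, BytesN, Env, Vec};

fn create_liquidity_pool_2(
    env: Env,
    lp_init_info: LiquidityPoolInitInfo,
    lp_wasm_hash: BytesN<32>,
    token_a: Address,
    token_b: Address,
) {
    // Build the deployment salt from the serialized token addresses.
    let mut salt = Bytes::new(&env);
    salt.append(&token_a.clone().to_xdr(&env));
    salt.append(&token_b.clone().to_xdr(&env));
    // Assumption: the deployer expects a 32-byte salt, so the concatenation is hashed.
    let salt = env.crypto().sha256(&salt);

    let lp_contract_address = env.deployer()
        .with_current_contract(salt)
        .deploy(lp_wasm_hash);

    // Anti-pattern: a single ledger entry holds the ever-growing vector.
    let mut lp_vec: Vec<Address> = env.storage()
        .persistent()
        .get(&DataKey::LpVec)
        .expect("Liquidity Pool vector not found");

    lp_vec.push_back(lp_contract_address.clone());

    env.storage().persistent().set(&DataKey::LpVec, &lp_vec);
}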
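The failure mode of create_liquidity_pool_2 can be reproduced with a small plain-Rust model of the single ledger entry that holds the vector. Both constants are placeholders chosen for this sketch: the real maximum ledger entry size is a network setting, and the real serialized size of an address may differ. `LpVecEntry` and `pools_until_full` are hypothetical names.

```rust
/// Hypothetical cap on the size of one ledger entry (placeholder value).
pub const MAX_ENTRY_BYTES: usize = 64 * 1024;
/// Assumed serialized size of one stored address (placeholder value).
pub const ADDRESS_BYTES: usize = 40;

/// Models the single `LpVec` ledger entry from Example 2.
#[derive(Default)]
pub struct LpVecEntry {
    addresses: Vec<[u8; ADDRESS_BYTES]>,
}

impl LpVecEntry {
    /// Current size of the entry under the simplified accounting.
    pub fn size_bytes(&self) -> usize {
        self.addresses.len() * ADDRESS_BYTES
    }

    /// Mirrors "read the vector, push_back, write the vector back".
    /// Fails once the entry would exceed the maximum entry size.
    pub fn push_back(&mut self, addr: [u8; ADDRESS_BYTES]) -> Result<(), &'static str> {
        if self.size_bytes() + ADDRESS_BYTES > MAX_ENTRY_BYTES {
            return Err("ledger entry is full");
        }
        self.addresses.push(addr);
        Ok(())
    }
}

/// Counts how many pools can be created before the entry is full.
pub fn pools_until_full() -> usize {
    let mut entry = LpVecEntry::default();
    let mut created = 0;
    while entry.push_back([0u8; ADDRESS_BYTES]).is_ok() {
        created += 1;
    }
    created
}
```

With these placeholder numbers the factory stops working after `MAX_ENTRY_BYTES / ADDRESS_BYTES` pools. The exact figure is not the point: any finite entry size is eventually exhausted by a vector that only grows.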

Finally

To summarize, Soroban comes with three storage types: Temporary Storage, Persistent Storage, and Instance Storage. Persistent and Instance Storage persist on the ledger permanently, which makes them prone to DoS risks if they are not used properly.

For Persistent Storage, we recommend saving the data into different storage slots instead of a single slot. For example, instead of saving the newly created contracts into the same vector, as shown in create_liquidity_pool_2, you could save them into different slots by using token_a and token_b as the key, similar to how it is done in create_liquidity_pool_1.

However, as the first example showed, this approach does not work for Instance Storage: even if you save the data into different slots, all of them are loaded whenever the contract is invoked, because they are attached to the contract instance. When using Instance Storage, make sure that the data saved into it cannot grow in an unbounded fashion.
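The per-pair slot recommendation for Persistent Storage can be sketched in plain Rust as follows. This is a model, not SDK code: `PersistentModel` and its methods are hypothetical, token identifiers are plain strings, and the two constants are placeholder values. The point is that each (token_a, token_b) pair gets its own small entry, so no single entry grows as more pools are created.

```rust
use std::collections::HashMap;

/// Hypothetical cap on the size of one ledger entry (placeholder value).
pub const MAX_ENTRY_BYTES: usize = 64 * 1024;
/// Assumed serialized size of one stored address (placeholder value).
pub const ADDRESS_BYTES: usize = 40;

/// Toy Persistent Storage: independent entries, each with its own size cap.
#[derive(Default)]
pub struct PersistentModel {
    entries: HashMap<(String, String), Vec<u8>>,
}

impl PersistentModel {
    /// Stores one pool address under its own (token_a, token_b) key.
    pub fn set_pool(
        &mut self,
        token_a: &str,
        token_b: &str,
        pool_addr: Vec<u8>,
    ) -> Result<(), &'static str> {
        // The size check applies per entry, not to the storage as a whole.
        if pool_addr.len() > MAX_ENTRY_BYTES {
            return Err("ledger entry too large");
        }
        self.entries
            .insert((token_a.to_string(), token_b.to_string()), pool_addr);
        Ok(())
    }

    /// A lookup touches exactly one entry, however many pools exist.
    pub fn get_pool(&self, token_a: &str, token_b: &str) -> Option<&Vec<u8>> {
        self.entries
            .get(&(token_a.to_string(), token_b.to_string()))
    }

    /// Number of independent entries currently stored.
    pub fn entry_count(&self) -> usize {
        self.entries.len()
    }

    /// Size of the largest single entry; stays constant as pools are added.
    pub fn largest_entry_bytes(&self) -> usize {
        self.entries.values().map(Vec::len).max().unwrap_or(0)
    }
}
```

In this model, creating thousands of pools increases `entry_count` but leaves `largest_entry_bytes` at the size of one address, so pool creation never runs into the per-entry limit.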


Author: Alberto Gonzalez
