Ethereum ERC-20 Meta-Transactions

In this blog, Pascal explores the concept of gasless transactions - also called meta-transactions - in the context of ERC-20 tokens.
Pascal Marco Caversaccio
October 22nd, 2021

Sometimes ETH gas fees suck; there is no other way to put it.
Here is a typical annoyance: I put an NFT up for sale on OpenSea and want to accept an offer, but accepting it costs me 120 USD in gas fees, and I only have 50 USD worth of ETH in my wallet at the moment.

Another example: I want to swap tokens worth several hundred USD on Uniswap, but the gas fees are higher than the amount of ETH I have in my wallet. In both cases I would first have to buy ETH just to pay for gas. This is bullshit and, unfortunately, makes no sense for many use cases.

We also develop wallets that enable transactions for a limited group of users. In such cases, it can make a lot of sense to waive gas fees by implementing meta-transactions.

The gas challenge on Ethereum

“Oh boy, a really cool project airdropped a token on my Ethereum wallet and now unfortunately I cannot transfer it to my main wallet because I do not have any ether on this address. That really sucks!”

If this sounds familiar to you, you can skip the next section. However, if the whole thing sounds cryptic to you, here is a brief explanation of the gas challenge on Ethereum.

Gas is pivotal in the Ethereum world. By analogy, it is the fuel that allows the network to operate, in the same way that a car needs gasoline to run. In summary, gas is the unit that measures the amount of computational effort required to execute specific operations (e.g. sending ether (ETH) or a token) on the Ethereum network. Since each Ethereum transaction requires computational resources to execute, each transaction requires a fee, commonly computed as the product of the gas price and the gas consumed by the transaction. Gas fees are paid in Ethereum's native currency, ETH. Gas prices are usually denoted in a unit called gwei, which is itself a denomination of ETH: each gwei is equal to 0.000000001 ETH (or 10^-9 ETH). For instance, instead of saying that your gas costs 0.000000001 ETH, you would usually say that your gas costs 1 gwei. The word gwei means giga-wei, and it is equal to 1,000,000,000 wei (1 wei = 10^-18 ETH). Wei, named after Wei Dai, the inventor of b-money, is the smallest unit of ETH.
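To make the unit arithmetic concrete, here is a small Python sketch. The gas price of 100 gwei is an assumed example value, not a quote from the network, and the helper names are made up for illustration; 21,000 is the gas consumed by a plain ETH transfer.

```python
WEI_PER_GWEI = 10**9   # 1 gwei = 10^9 wei
WEI_PER_ETH = 10**18   # 1 ETH  = 10^18 wei


def fee_in_wei(gas_used: int, gas_price_gwei: int) -> int:
    """Fee = gas consumed * gas price, computed in integer wei."""
    return gas_used * gas_price_gwei * WEI_PER_GWEI


def wei_to_eth(wei: int) -> str:
    """Format a wei amount as a decimal ETH string without float rounding."""
    whole, frac = divmod(wei, WEI_PER_ETH)
    return f"{whole}.{frac:018d}"


# Assumed example: a plain ETH transfer (21,000 gas) at 100 gwei per gas unit.
fee = fee_in_wei(21_000, 100)
print(fee)              # 2100000000000000
print(wei_to_eth(fee))  # 0.002100000000000000
```

Working in integer wei and only formatting at the end avoids the rounding errors that floating-point ETH amounts would introduce.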
But why do gas fees actually exist? In short, gas fees help keep the Ethereum network secure. By requiring a fee for every computation executed on the network, we prevent bad actors from spamming the network. For a further deep-dive into this topic, I can recommend this reference.

So what we have learned so far is that every (write) interaction on the Ethereum blockchain requires a small amount of ETH on the interacting address. This sounds really awful from a UX perspective for token holders, as users first need to acquire ETH via a centralised exchange and transfer it to the wallet address accordingly. But wait, isn't it the case – very simplified – that at the most foundational blockchain level, it is simply a matter of verifying the signed payload, i.e. off-chain cryptography? Ah yes, that sounds right! So how about the wallet user simply signs the payload off-chain and someone else (e.g. an operator) broadcasts and pays for the transaction? There you go, we have the solution: meta-transactions.

Background

A meta-transaction is a regular Ethereum transaction that contains another transaction, the actual transaction. The actual transaction is signed by a user and then sent to an operator (or a similar relaying service); the user needs no gas and does not interact with the blockchain directly. The operator takes this signed transaction and submits it to the blockchain, paying the fees themselves. The forwarding smart contract ensures there is a valid signature on the actual transaction and then executes it.

In the context of ERC-20 token transfers, we must also be aware of the following important governance layer: Arguably one of the main reasons for the success of ERC-20 tokens lies in the interplay between approve and transferFrom, which allows for tokens to not only be transferred between externally owned accounts (EOA) but also to be used in other contracts under application-specific conditions by abstracting away msg.sender as the defining mechanism for token access control.

However, a limiting factor in this design stems from the fact that the ERC-20 approve function itself is defined in terms of msg.sender. This means that the user's initial action involving ERC-20 tokens must be performed by an EOA. If the user needs to interact with a smart contract, then they need to make two transactions (approve and the smart contract call which will internally call transferFrom). Even in the simple use case of paying another person, they need to hold ETH to pay for transaction gas costs.

To resolve this challenge, we can extend the ERC-20 token contract with a new function permit, which allows users to modify the allowance mapping using a signed message (via secp256k1 signatures), instead of through msg.sender. Or in other words, the permit method can be used to change an account's ERC-20 allowance (see IERC20.allowance) by presenting a message signed by the account. By not relying on IERC20.approve, the token holder account does not need to send a transaction and thus is not required to hold ETH at all.

For an improved user experience, the signed data is structured following EIP-712, which already has widespread adoption in major RPC & wallet providers.
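As an illustration of what such a structured message can look like, the following Python sketch assembles an EIP-712 payload for permit. The Permit field list is the one defined in EIP-2612; whether the token contract discussed here uses exactly these fields is an assumption. The helper name build_permit_typed_data, the token name, the version, the chain ID, and all addresses are hypothetical placeholders.

```python
import json


def build_permit_typed_data(token_name, token_address, chain_id,
                            owner, spender, value, nonce, deadline):
    """Assemble the JSON structure a wallet is asked to sign (EIP-712 layout)."""
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            # Field list as defined in EIP-2612 (assumed to match the token).
            "Permit": [
                {"name": "owner", "type": "address"},
                {"name": "spender", "type": "address"},
                {"name": "value", "type": "uint256"},
                {"name": "nonce", "type": "uint256"},
                {"name": "deadline", "type": "uint256"},
            ],
        },
        "primaryType": "Permit",
        "domain": {
            "name": token_name,
            "version": "1",  # assumed version string
            "chainId": chain_id,
            "verifyingContract": token_address,
        },
        # uint256 values are passed as strings so JSON consumers keep precision.
        "message": {
            "owner": owner,
            "spender": spender,
            "value": str(value),
            "nonce": str(nonce),
            "deadline": str(deadline),
        },
    }


# Placeholder values only; none of these refer to a real deployment.
typed_data = build_permit_typed_data(
    token_name="ExampleToken",
    token_address="0x" + "11" * 20,
    chain_id=1,
    owner="0x" + "22" * 20,
    spender="0x" + "33" * 20,
    value=10**18,
    nonce=0,
    deadline=1_700_000_000,
)
print(json.dumps(typed_data, indent=2))
```

The user signs this payload off-chain; the resulting signature is what a relayer would later pass to permit on the user's behalf.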

This setup leads us to the following architecture:

[Figure: architecture diagram, /assets/1-img/content/ETH20-architecture.png]

EIP-712 - Ethereum Typed Structured Data Hashing and Signing

EIP-712 is a standard for hashing and signing of typed structured data. The encoding specified in the EIP is very generic, and such a generic implementation in Solidity is not feasible, thus our contract does not implement the encoding itself. Protocols need to implement the type-specific encoding they need in their contracts using a combination of abi.encode and keccak256.

Our example smart contract, Forwarder.sol, implements the EIP-712 domain separator (_domainSeparatorV4) that is used as part of the encoding scheme, and the final step of the encoding, which produces the message digest (_hashTypedDataV4) that is then signed via ECDSA.

The OpenZeppelin implementation of the domain separator was designed to be as efficient as possible while still properly updating the chain ID to protect against replay attacks on an eventual fork of the chain.

Note: The smart contract Forwarder.sol implements the version of the encoding known as "v4", as implemented by the JSON-RPC method eth_signTypedData_v4 in MetaMask.
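To show the shape of the encoding, the Python sketch below builds the EIP-712 type string and the byte layout of the final digest preimage. It assumes the ForwardRequest struct has the fields from, to, value, gas, nonce and data with the types shown (as in OpenZeppelin's MinimalForwarder); the helper names are illustrative. Python's standard library has no keccak256 (hashlib.sha3_256 is the later NIST variant and yields different digests), so the sketch stops before hashing and uses zero-filled placeholders for the two 32-byte hashes.

```python
# Assumed field layout of ForwardRequest: (type, name) pairs in declaration order.
FORWARD_REQUEST_FIELDS = [
    ("address", "from"),
    ("address", "to"),
    ("uint256", "value"),
    ("uint256", "gas"),
    ("uint256", "nonce"),
    ("bytes", "data"),
]


def encode_type(name, fields):
    """EIP-712 encodeType: 'Name(type1 name1,type2 name2,...)'."""
    members = ",".join(f"{ftype} {fname}" for ftype, fname in fields)
    return f"{name}({members})"


def digest_preimage(domain_separator: bytes, struct_hash: bytes) -> bytes:
    """Bytes that get keccak256-hashed in the last step: 0x19 0x01 || domain || struct."""
    if len(domain_separator) != 32 or len(struct_hash) != 32:
        raise ValueError("both hashes must be exactly 32 bytes")
    return b"\x19\x01" + domain_separator + struct_hash


type_string = encode_type("ForwardRequest", FORWARD_REQUEST_FIELDS)
print(type_string)

# Placeholders: real values would be keccak256 outputs, not zero bytes.
preimage = digest_preimage(bytes(32), bytes(32))
print(len(preimage))  # 66
```

The keccak256 hash of the type string is what a contract would typically store as its type hash constant, and the keccak256 hash of the 66-byte preimage is the digest the user signs.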

Forwarder Contract - A Smart Contract for Extensible Meta-Transaction Forwarding on Ethereum

The smart contract Forwarder.sol extends EIP-2770 and provides the following core functions:

  • verify: Verifies the signature based on the typed structured data.

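function verify(ForwardRequest calldata req, bytes calldata signature) public view returns (bool) {
    // Rebuild the EIP-712 digest from the request and recover the signer via ECDSA.
    address signer = _hashTypedDataV4(keccak256(abi.encode(
        _TYPEHASH,
        req.from,
        req.to,
        req.value,
        req.gas,
        req.nonce,
        keccak256(req.data) // dynamic bytes are hashed, as EIP-712 requires
    ))).recover(signature);
    // Valid only if the nonce is the next expected one and the signer is req.from.
    return _nonces[req.from] == req.nonce && signer == req.from;
}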
  • execute: Executes the meta-transaction via a low-level call.

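function execute(ForwardRequest calldata req, bytes calldata signature) public payable whenNotPaused() returns (bool, bytes memory) {
    require(_senderWhitelist[msg.sender], "Forwarder: sender of meta-transaction is not whitelisted");
    require(verify(req, signature), "Forwarder: signature does not match request");
    // Consume the nonce before the external call so the request cannot be replayed.
    _nonces[req.from] = req.nonce + 1;

    // Append the signer's address to the calldata so the target can identify the original sender.
    (bool success, bytes memory returndata) = req.to.call{gas: req.gas, value: req.value}(abi.encodePacked(req.data, req.from));

    // Bubble up the revert data of a failed call unchanged.
    if (!success) {
        assembly {
            returndatacopy(0, 0, returndatasize())
            revert(0, returndatasize())
        }
    }

    // At most 63/64 of the remaining gas can be forwarded to a call (EIP-150), so this
    // check fails if the relayer supplied too little gas for the requested req.gas.
    assert(gasleft() > req.gas / 63);

    emit MetaTransactionExecuted(req.from, req.to, req.data);

    return (success, returndata);
}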
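
Two details of execute are easy to miss: the signer's address is appended to the calldata via abi.encodePacked(req.data, req.from), and the final assert guards against a relayer supplying too little gas. The Python sketch below models both on plain bytes and integers. The function names are illustrative. The recipient-side extraction assumes the target contract reads the last 20 bytes as the original sender (the EIP-2771 convention), and the gas model is deliberately simplified: it ignores the gas spent by execute itself and assumes the callee burns everything it is given.

```python
from typing import Tuple

ADDRESS_LENGTH = 20  # an Ethereum address is 20 bytes


def append_sender(data: bytes, sender: bytes) -> bytes:
    """Model of abi.encodePacked(req.data, req.from): plain concatenation."""
    if len(sender) != ADDRESS_LENGTH:
        raise ValueError("an address is exactly 20 bytes")
    return data + sender


def split_sender(calldata: bytes) -> Tuple[bytes, bytes]:
    """What a recipient would do: treat the last 20 bytes as the signer."""
    if len(calldata) < ADDRESS_LENGTH:
        raise ValueError("calldata too short to contain an appended address")
    return calldata[:-ADDRESS_LENGTH], calldata[-ADDRESS_LENGTH:]


def gas_left_after_call(gas_before_call: int, requested_gas: int) -> int:
    """Worst case: the callee uses all the gas that was forwarded to it."""
    forwardable = gas_before_call - gas_before_call // 64  # EIP-150 cap
    forwarded = min(requested_gas, forwardable)
    return gas_before_call - forwarded


def passes_gas_check(gas_before_call: int, requested_gas: int) -> bool:
    """Mirror of assert(gasleft() > req.gas / 63) under the model above."""
    return gas_left_after_call(gas_before_call, requested_gas) > requested_gas // 63


# Placeholder payload and address, not real on-chain values.
payload = bytes.fromhex("a9059cbb")
alice = bytes.fromhex("22" * 20)
data, sender = split_sender(append_sender(payload, alice))
print(data == payload, sender == alice)

print(passes_gas_check(650_000, 630_000))  # enough gas supplied
print(passes_gas_check(500_000, 630_000))  # relayer under-funded the call
```

In this model the check passes only when the relayer supplied enough gas for the inner call to actually receive req.gas, which is the situation the assert is meant to enforce.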

The full smart contract can be found here.

Security Considerations

To provide replay protection, we track a nonce mapping on-chain. Further, to prevent arbitrary accounts from broadcasting transactions with potentially malicious intent, the Forwarder.sol smart contract implements a whitelist for the execute function. The smart contract is also Ownable, which provides a basic access control mechanism: there is an EOA (the owner) that is granted exclusive access to specific functions (i.e. addSenderToWhitelist, removeSenderFromWhitelist, killForwarder, pause, unpause). In addition, the smart contract function execute is Pausable, i.e. it implements an emergency stop mechanism that can be triggered by the owner. Finally, as an emergency backup, a selfdestruct operation is implemented via the function killForwarder.
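
The interplay of these checks can be sketched as a toy model in Python. This is not the contract's logic verbatim: the class ToyForwarder and its method names are hypothetical, accounts are plain strings, and signature verification is stubbed out as a boolean flag so that the example can focus on the nonce, whitelist, and pause checks.

```python
class ToyForwarder:
    """Toy model of the access checks; signature checking is stubbed (assumption)."""

    def __init__(self, owner: str):
        self.owner = owner
        self.whitelist = set()   # relayers allowed to call execute
        self.nonces = {}         # next expected nonce per signer
        self.paused = False

    def _only_owner(self, caller: str) -> None:
        if caller != self.owner:
            raise PermissionError("caller is not the owner")

    def add_sender_to_whitelist(self, caller: str, relayer: str) -> None:
        self._only_owner(caller)
        self.whitelist.add(relayer)

    def pause(self, caller: str) -> None:
        self._only_owner(caller)
        self.paused = True

    def execute(self, relayer: str, signer: str, nonce: int,
                signature_ok: bool = True) -> bool:
        if self.paused:
            raise RuntimeError("forwarder is paused")
        if relayer not in self.whitelist:
            raise PermissionError("sender of meta-transaction is not whitelisted")
        # Replay protection: the request must carry the next expected nonce.
        if not signature_ok or self.nonces.get(signer, 0) != nonce:
            raise ValueError("signature does not match request")
        self.nonces[signer] = nonce + 1  # consume the nonce
        return True


forwarder = ToyForwarder(owner="owner")
forwarder.add_sender_to_whitelist("owner", "relayer")
print(forwarder.execute("relayer", "alice", 0))  # first submission succeeds

try:
    forwarder.execute("relayer", "alice", 0)     # same nonce again
except ValueError as err:
    print("replay rejected:", err)
```

Because the nonce is incremented on every successful execution, a signed request that has already been relayed once no longer matches the stored nonce and is rejected.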

Note 1: It is of utmost importance that the whitelisted EOAs carefully check the encoded (user-signed) calldata before sending the transaction.

Note 2: calldata is where the data from external calls to functions is stored. Functions can be called internally, i.e. from within the contract, or externally. A function with external visibility cannot be called internally (except via this.f()); it is called from other contracts or via transactions. When such an external call happens, the data of that call is stored in calldata.


Note 3: For the functions addSenderToWhitelist and killForwarder we deliberately do not enforce a strict policy that rejects the zero address 0x0000000000000000000000000000000000000000. The reason is that, firstly, the functions are protected by being Ownable and, secondly, it can be argued that addresses like 0x0000000000000000000000000000000000000001 are just as dangerous, yet nothing is done about them either.

Signed User Data (Input Parameters) for permit and execute

I have prepared a sample JavaScript-based script that can be accessed here.

Note: The first four bytes of the calldata for a function call specify the function to be called. They are the first (left, high-order in big-endian) four bytes of the keccak256 hash of the function's signature. Since one nibble (4 bits) is represented by one hex digit, 4 bytes correspond to 8 hex digits.
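
The calldata layout can be inspected with a few lines of Python. The sketch below uses the well-known selector 0xa9059cbb of the ERC-20 function transfer(address,uint256) instead of computing keccak256, which is not available in Python's standard library. The recipient address and the amount are placeholder values, and the helper name split_calldata is made up for this example.

```python
WORD_SIZE = 32      # ABI arguments are padded to 32-byte words
SELECTOR_SIZE = 4   # first four bytes identify the function


def split_calldata(calldata_hex: str):
    """Split hex calldata into the 4-byte selector and its 32-byte argument words."""
    if calldata_hex.startswith("0x"):
        calldata_hex = calldata_hex[2:]
    raw = bytes.fromhex(calldata_hex)
    selector, args = raw[:SELECTOR_SIZE], raw[SELECTOR_SIZE:]
    if len(selector) != SELECTOR_SIZE or len(args) % WORD_SIZE != 0:
        raise ValueError("malformed calldata")
    words = [args[i:i + WORD_SIZE] for i in range(0, len(args), WORD_SIZE)]
    return selector, words


# transfer(address,uint256) with a placeholder recipient and an amount of 1000.
recipient = "11" * 20
example = "0xa9059cbb" + recipient.rjust(64, "0") + format(1000, "064x")

selector, words = split_calldata(example)
print(selector.hex())                  # a9059cbb
print(len(selector.hex()))             # 8
print(words[0][-20:].hex())            # the 20-byte recipient address
print(int.from_bytes(words[1], "big")) # 1000
```

The selector occupies exactly 8 hex digits, and every argument that follows takes up one 32-byte word (64 hex digits) in this simple static case.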

Final Remarks

It is important to stress that the codebase has not been audited, so use it at your own risk (and test it first on one of the live test networks). Also, one could argue that some of the custom functions like killForwarder are unnecessary given the pausable functionality. I make no claim to a perfect smart contract here, but rather provide a good playground in which enough security aspects (such as replay protection) are taken into account. In addition, a complete suite of unit tests that achieves a test coverage of 100% is available here. It is very important to note, however, that a test coverage of 100% does not mean that there are no vulnerabilities. What really counts is the quality and spectrum of the tests themselves. Finally, the approach described above is by no means the end of the road: potential weaknesses remain, for example regarding the decentralisation of the paymaster. A project that addresses this issue with a more advanced architecture is the Ethereum Gas Station Network (GSN). Go and check it out!