Skip to main content

· 4 min read
TLDR;

Human-readability of Ethereum Transactions is a multi-faceted and complex problem that requires ecosystem-wide collaboration. Therefore, it makes sense to create a working group to gather people, projects, and knowledge.

Motivation

It is a well-known UX problem in Ethereum that users usually don't/can't verify the action they are about to take, because they are not presented with human-readable information. This has led to social engineering hacks where victims lost millions. In one case, a hacker was able to replace the browser wallet, which made the victim sign a transfer transaction on his HW wallet that sends all the tokens to the hacker. In another, the hacker created an offline signature for the victim to list all his NFTs for free.

As a basic example, our goal is to show something similar to the one on the right rather than on the left.

Bytecode vs Human-Readable Tx

Nowadays, many wallets can do the basic ABI decoding and show a verified contract link but users still lack a description of the action they are about to take and additional safety information about the contract they are going to interact with.

How we achieve this at Sourcify is through the NatSpec documentation. If you document your code using NatSpec's @notice and @dev fields and fully verify your contract on Sourcify, the wallet can show the users the description you wrote when calling the function. (details in this talk at Devcon VI or this lightning talk).

Over time it became clear to me that even if we convince the majority of developers to document using NatSpec and fully verify on Sourcify, this single route won't solve this wicked problem of Human-readable Transactions. The problem is multi-faceted and requires different approaches for different cases. For instance, you can't add NatSpec docs to an already deployed contract, or you can't use Dynamic Expressions for a commit-reveal transaction (e.g. ENS commit).

Actually, there are different approaches, some of which we gathered in the Sourcify docs. Unfortunately, most of them seem to be stale.

Another motivation for us has been the lack of knowledge of what's going on in the space. Even though we were working on this problem, we haven't been aware of the following for a long time:

Solving this problem of transaction human readability is hard and is and requires ecosystem-wide collaboration.

For this reason, it makes sense to form a "Human-Readable Transactions Working Group" focused on this specific problem with different interested parties

Scope

How do we define the scope?

Our starting point is the human-readability of the transactions but this really cannot be separated from the safety, UX and human-friendliness. Depending on the progress, other UX and safety aspects are expected to be included in the general work (audits, token registries etc.). Initially, it's called “human-readable tx's WG”, but we'll see where it goes.

The work will mostly be on EVM, but not specific to the Ethereum network.

Goals

  • 🎯 Being the Schelling Point: Gather different parties working on the transaction readability, security, and UX in the same place. Enable collaboration between parties, and make sure everyone knows who's working on what.
  • 📚 Being the knowledge base: Discuss and compile the different approaches to the problem. Lay out the advantages and disadvantages of different methods. Document them for the public.
  • 🌟 Open-source the solutions to solve it once and for all.

The goal, however, is not to work on a single agreed solution to the problem. As said, there is no single solution to this problem due to its complexity and context dependence. Likely, there will be conflicts and forks, and each team will focus on what they think is the best way. Ideation and active feedback should allow us to reach the best solutions faster.

Structure

This is also a TBD but one potential place for this WG is CASA.

Interested?

Are you working on similar problems and want to collaborate? Reach out to me on Twitter @kaanuzdogan, Matrix @kuzdogan:matrix.org, or Telegram (@kuzdogan)!

· 4 min read

Today we released Sourcify v2 🎉

The changes do not affect the Sourcify Server API in a non-backwards compatible way. If you are using the Sourcify API you don't need to worry. However are some non-breaking additions detailed below.

Why is this a major update then?

Motivation

The motivation for these changes is to make Sourcify verification more reusable. The lib-sourcify package can be imported into other projects and verify a contract given the source files, and chain&address. Another goal was to create modularity in the codebase with more separated concerns. With these changes, Sourcify server consumes the core lib-sourcify functionality, and takes care of the rest: providing an API, validating inputs, and storing the results (in the repo) etc.

This is in line with what we want to achieve with edge verification. We beleive a contract verification should be easily reproducable and you should be able to verify contracts locally without relying on a third party.

Imagine you're interacting with a contract on your wallet. Before you sign a transaction your wallet:

  • fetches the contract's source code from IPFS
  • compiles and verifies with lib-sourcify

without even talking to Sourcify or any other verifier, everything happens on your local machine. Similarly a block explorer like Otterscan can give its users the option to either fetch the verified source code directly from a verifier (like Sourcify), or verify the contract locally on the frontend.

However, the library as is it not compatible with browsers yet and we are working on it. If you are knowledgable on this front and want to help us, please reach us out.

lib-sourcify

The brand new @ethereum-sourcify/lib-sourcify is the library that will do all the weightlifting of assembling a contract (e.g. source files) into a compilable CheckedContract, compiling, and verifying it. You can pass checkFiles your contract source code and metadata.json to pack compilable CheckedContracts.

const pathBuffers: PathBuffer[] = [];
pathBuffers.push({
path: filePath,
buffer: fs.readFileSync(filePath),
});
const checkedContracts: CheckedContract[] = await checkFiles(pathBuffers);

Then you can verify this CheckedContract against a contract that is deployed on a chain at an address.

const goerliChain =   {
name: "Goerli",
rpc: [
"https://locahlhost:8545/"
"https://goerli.infura.io/v3/${INFURA_API_KEY}",
],
chainId: 5,
},

const match = await verifyDeployed(
checkedContract[0],
goerliChain,
'0x00878Ac0D6B8d981ae72BA7cDC967eA0Fae69df4'
)

console.log(match.status) // 'perfect'

Context Variables

In previous versions we introduced "verification with simulation", that is, executing the contract's compiled creation bytecode on an EVM to see if it will yield the same onchain deployed bytecode at the given chainId & address. This allows us to verify "contracts with immutables created by factory contracts".

With this we let the user pass contextVariables to the "simulation". These variables are useful when they are used as a value to an immutable variable in the contract's constructor. Currently you can pass two types of variables:

const contextVariables = {
msgSender: '0x00878Ac0D6B8d981ae72BA7cDC967eA0Fae69df4',
abiEncodedConstructorArguments: "0x0000000000000000000000000000000000000000000000000000000000038efe",
}

const match = await verifyDeployed(
checkedContract[0],
goerliChain,
'0x00878Ac0D6B8d981ae72BA7cDC967eA0Fae69df4'
contextVariables
)

(In the server API, find the field contextVariables)

For example, if you have a contract like this:

contract Child2{
address immutable public owner;
constructor(){
owner = msg.sender;
}
}

passing the msgSender variable allows lib-sourcify to assign this variable inside the bytecode, and be able to verify, because it's an immutable variable. If there are other potential variables that are used as a value to immutables in contracts, please let us know.

Creator Tx Hash

We can also verify contracts by looking at the tx.input of the transaction that created the contract. If this matches the creation bytecode of the compiled contract AND the address resulting from the tx.from and tx.nonce matches the given address, we can verify the contract.

const match = await verifyDeployed(
checkedContract[0],
goerliChain,
"0x00878Ac0D6B8d981ae72BA7cDC967eA0Fae69df4".undefined,
"0xe75fb554e433e03763a1560646ee22dcb74e5274b34c5ad644e7c0f619a7e1d0" //tx hash
);

(In the server API, find the field creatorTxHash)

CREATE2

You can also verify CREATE2 created contracts:

const match = await verifyCreate2(
checkedContract[0],
deployerAddress,
salt,
create2Address,
abiEncodedConstructorArguments
);

console.log(match.chainId); // '0'. create2 matches return 0 as chainId
console.log(match.status); // 'perfect'

Questions? Feedback?

As usualy feel free to reach us out on Twitter, Matrix chat, or Gitter.

✅ Happy verifying!

· 7 min read

In an ecosystem with the core values of transparency, security, and trust (and trustlessness); it is expected from all contract developers to publish their source code. If you're even slightly familiar with Ethereum, there is no need for further explaination.

But if I give you a source code, how do you make sure the published source code really is the source code of the contract? That's where source code verification comes into play.

note

Throughout this article and 99% of the time in Sourcify context, by verification we will be referring to smart contract verification. Verification sometimes also refers to formal verification.

What is source code verification?

First thing first, all the smart contracts on blockchain are stored in bytecode. Just like our physical machines that only speak bits and bytes, Ethereum Virtual Machine also only understands bytes. If you ask the Ethereum blockchain the code of a contract, you only get a byte string.

So, let's say I give you a contract in Solidity and claim that this is the code behind the contract at "0xabcdef...". To verify, you need to make sure this code compiles to the same bytecode as the claimed contract at "0xabcdef...". This is the basic idea behind the smart contract verification: we compile a contract and check if the bytecode matches the one on blockchain.

Visualization of the compilation of a contract Checking if bytecodes match

You have probably made use of contract verification before. For many users this is the green checkmark in Etherscan:

Green checkmark in a verified contract page on Etherscan

You see the green checkmark and you are happy!

But is it really exactly the same code that is deployed?

The answer is, you don't know 🤷

In fact, no one else would be able to know except the contract developer, and he/she can't really prove it. The reason is, when compiling the contract i.e. translating the human-readable source code (in Solidity or any other higher-level language) to machine-readable bytecode, some information is lost. These include internal variable names, internal function names, names of contracts etc.

So yes this is functionally the same code as deployed: it compiles to the same bytecode as the original mysterious source code 🕵.

And you might be thinking, sure this is good enough. But:

  • Someone can insert misleading comments, (internal) function or variable names
  • Whoever verifies a contract first is chosen as the matching result, not the "authentic" one
  • We can't verify things other than the contract's code itself (i.e. metadata)

In fact when not verified properly, it is possible to inject code that would be shown in the verified source code.

Enough bad news... There's actually a way to verify Solidity contracts that would cryptographically ensure the exactness of the source files and it is already here: It's called Sourcify!

This way of verifying contracts is what we call a perfect verification, (in contrast to partial verification). This is enabled by the Solidity contract metadata, and that the hash of it is appended to the contract's bytecode. The metadata hash acts as a fingerprint of the whole compilation and with the information in the metadata file we can completely reproduce the contract compilation.

Contract Metadata

The Solidity compiler by default appends some information to the contract's bytecode in CBOR encoding. This special field, I like referring to as auxdata, usually contains the "Solidity version", the "metadata hash", and occasionally the "experimental" flag. The encoded data and it's decoding looks like this:

Decoding of the auxdata appended to the bytecode

You can actually inspect this field and see the decoding in action for any contract in playground.sourcify.dev.

To see how this cryptographically ensures the exactness of the source files we need to look into the contents of the metadata file. The metadata file is a JSON document that looks like this and contains information on two things:

  1. How to interact with the contract: ABI, documentation
  2. How to reproduce a contract compilation: compiler version and settings, source file information

The latter is the relevant field for our purposes. Specifically, the fact that the metadata file contains source file hashes. To illustrate this, let's walk through what happens when you compile a contract and what happens when you change a source file.

When you compile a contract, the compiler computes the hashes of the source files and embeds this information in the metadata file. On the right side, you see the relevant fields of the metadata file:

Embedding of the hash of the source files inside the metadata file

Then the compiler takes the hash of this whole file:

Taking the IPFS hash of the metadata file

And encodes it in the auxdata at the end of the bytecode:

Encoding of the metadata hash at the end of the bytecode

So if you were to decode the auxdata you'd see:

Decoding of the metadata hash at the end of the bytecode

What happens when we change something in the source files? Say we change a variable name or a comment in the new MyContract-diff.sol file. In turn the hash of the file changes, as well as the hash in the metadata:

The change of the hash when the source file changes

...and of course the hash of the metadata file changes:

The change of the hash when the metadata file changes

...and the auxdata changes:

The change of the auxdata when the hash changes

Sooo, if we match both the bytecode + the appended auxdata, we have byte-by-byte exactly the same source code and compilation settings of the original deployed contract. This is a perfect verification.

The perfect verification

If the bytecode matches but not the auxdata (which includes the metadata hash), we have a partial verification.

The partial verification

Did you notice?

If you are familiar with IPFS and paid attention, you might ask: Can't we already get everything from the bytecode itself?

And yes, if published on IPFS, you can actually fetch the source code from the bytecode of a contract, because all the information is already there:

  • The metadata IPFS hash is appended to the bytecode so (if published) you can fetch the metadata file.
  • The metadata file contains (alongside the normal keccak256) the IPFS hashes of the source files so you can fetch the complete source code from IPFS.

So there's only one thing that you need to do as a contract developer: Publish your source files and metadata on IPFS.

Why do you need verification then? Isn't the source file already out there?

Although unlikely since the compiler does it automatically, someone can change the auxdata of the contract before deploying it and show you a different random source code. We make sure it really is the same code by doing a whole recompilation of the provided files and comparing the resulting bytecodes. Plus, we share all verified contracts in our repository on IPFS to make sure it's available.

Conclusion

Perfect verification enables more secure and transparent verification on contracts, as well as other useful things such as decoding tx's and enabling human-readable contract interactions, but this is a topic for another article.

Next level smart contract verification is already here. We just need to adopt this way of verifying contracts as a community. Obviously, we need a lot of tooling, integrations, and more awareness. Let's step up and make this the standard way of verifying contracts!

(This article is a summary of my recent talks about Sourcify. If you are interested in learning more, check out one of the latest talks)