
We tried to fix blind signing, here's what we learned

· 17 min read

In the earlier days of Sourcify, there were other things we were focusing on besides source-code verification. Sourcify was there to foster the adoption of the Solidity metadata and to make transactions human-readable.

From September 2021, I was the only one working on the project, until Marco joined in July 2022 and Manuel in April 2024. In our earlier talks you will see that we start with the problem statement of "YOLO signing" and explain how Sourcify, the Solidity metadata.json, and NatSpec can solve this. That talk is actually my first ever conference talk (as you can tell from how nervous I am), and I was naive. The idea was that if you document your Solidity methods with NatSpec, this userdoc/devdoc ends up in the metadata.json, and you can use these human-readable messages to tell the user what is going to happen before they call a contract method (more below).

Old Sourcify website showing human-readable transactions

The old Sourcify website highlighting human-readable transactions as a key feature

As I got more familiar with the problem, I soon realized that it is too big to be solved with metadata.json + NatSpec alone. We had to attack it from different angles for different cases. This led us to start the "Human-Readable Transactions Working Group" initiative (kudos to Dustin and Karen for the help), which brought together many people from Safe, Metamask, Coinbase Wallet, etc. We held multiple calls, but unfortunately we as Sourcify couldn't continue our coordinating role and the initiative died around August 2023. Marco and I didn't have enough bandwidth and experience, and had to go back to focusing on our core value proposition: making source-code verification open-source and the verified contract data open and accessible.

Slaying blind signing meme

The problem of blind-signing turned out to be a bigger dragon than we could slay.

Still, we learned a lot, and I got to see what people have been proposing to solve this. Following the ByBit hack, which was essentially caused by blind-signing, I'll take this chance to gather everything we've learned and share it with everyone. There have been countless cases of people losing significant amounts of money because of blind signing. But the ByBit hack, the biggest hack in crypto history with 1.4 billion USD stolen, has brought the attention back to this issue.

I started this article with blind-signing in mind but the problem is actually even larger: it's transaction safety. It means nothing if you can read a transaction but it does not do what you read or if you're talking to the wrong contract. It's layers and layers and layers of security. Which also means layers and layers and layers of coordination effort. That's why it's so difficult.


This is by no means a complete guide. I'm more than happy to extend it. Please get in touch if you have anything to add!

1. Source Code Verification

This is a MUST. Period. Without verification there's no way to tell what a contract really does.


The contract should be verified by a verifier, and ideally at multiple places. Even better is a "perfect/exact match" verification: it gives you a cryptographic guarantee that what you see is exactly what the contract author deployed, down to the comments, whitespace, etc.

Players here: Blockscout, Etherscan, Routescan, Tenderly, Sourcify, and the Verified Alliance initiative.

Wallets MUST warn users about unverified contracts and use multiple sources to check verification. Unfortunately, even in 2025, not many meet even this simple requirement.

The tricky thing here is to not lead the user to believe verified == safe. A verified contract can still be malicious and verification does not check what the contract does. I actually wish we had a better name like "open-sourced contracts" instead of verification, but it is what it is.

2. Labeling

The second step is to make sure you are talking to the correct contract. You think you are making a swap on Uniswap but is it really Uniswap or just a verified Uniswap copy at another address? Maybe the contract you are talking to is already marked malicious and it'd be such a shame to lose money to an already known scammer.

So this entails whitelists and blacklists of contracts. I don't have deep knowledge around this but I don't really recall seeing contract labels in my wallets. There are "web3 security" companies or Etherscan labels but the problem with these is everyone solves it for themselves.

I'll be saying this many times:

In order to solve this problem once and for all, THE SOLUTION HAS TO BE OPEN.

Closed, siloed data will only get us so far.

Here the project I want to highlight is the Open Labels Initiative stewarded by If there are other solutions/projects worth sharing, please share them.

3. Audits

This is part of labeling but it deserves its own mention. We need open, public audit lists of contracts. Wallets need to show whether the contract you are about to interact with is audited and, if not, warn you.

The second step could be to create a ranking/reputation system of auditors. As you all know, not all auditors are created equal. And again, all these lists MUST be open and public. Note that I don't necessarily mean "decentralized". A good enough centralized solution is better than nothing as long as the data is fully open, not monetized, and incentives are aligned to keep it that way. This can be an alliance of multiple parties, or a project sustained by grants. As you can see, this by itself is already a big enough problem to solve.

This is also a part of the ecosystem I'm not too knowledgeable in, so please share what you know.

4. Human-Understandability of Transactions

I named this "understandability" instead of "readability" because readability is a necessary but not sufficient part of our main goal, i.e. being able to understand what you are doing. swapTokensForExactTokens is readable, but it is quite meaningless to a normal person. If you have no idea what you're doing, you're basically giving away your money.

POV signing transactions

Dog-readable transactions, anyone?

4.1 ABI-Decoding

It's year 2025 and I can't believe I have to mention this.

By now it should be trivial to take the ABI JSON of a contract (verification required), and decode the calldata.

Most of the time this is not really helpful to the user though.

Tx 0x5a45979e2a4a22855a1a2aee1ecb346b4ecee2d4e498817e8ed98d86476809ce on Mainnet

If you're given only the raw calldata:


A really simple decoding yields a much more meaningful message (decoded in Blockscout):

Decoded calldata

This is incredibly straightforward: just pull the ABI from a verifier and decode with a library like ethers.js: abiCoder.decode(abiParamInputs, calldata). It's really sad that even this is lacking in a lot of places.

ABI-decoding is not always this helpful though. Usually the called methods have far more complex arguments than the one above.
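To make the decoding step concrete without pulling in a library, here is a minimal sketch in TypeScript. It only handles functions whose arguments are all static 32-byte words (address, uint256, bool), and it takes the 4-byte selector as given instead of computing it. The `FunctionFragment` shape and the `decodeStaticCalldata` name are made up for this illustration; in practice a library like ethers.js covers the full ABI.

```typescript
// Minimal sketch: decode calldata whose arguments are all static 32-byte words.
// Dynamic types (bytes, string, arrays) need offset handling and are out of scope.
type StaticType = "address" | "uint256" | "bool";

// Hypothetical, simplified view of one ABI entry.
interface FunctionFragment {
  name: string;
  selector: string; // first 4 bytes of keccak256(signature), taken as given here
  inputs: { name: string; type: StaticType }[];
}

function decodeStaticCalldata(
  fragment: FunctionFragment,
  calldata: string
): Record<string, string> {
  const hex = calldata.replace(/^0x/, "").toLowerCase();
  const selector = "0x" + hex.slice(0, 8);
  if (selector !== fragment.selector.toLowerCase()) {
    throw new Error(`Selector ${selector} does not match ${fragment.name}`);
  }
  const body = hex.slice(8);
  if (body.length !== fragment.inputs.length * 64) {
    throw new Error("Unexpected calldata length for static arguments");
  }
  const decoded: Record<string, string> = {};
  fragment.inputs.forEach((input, i) => {
    const word = body.slice(i * 64, (i + 1) * 64);
    if (input.type === "address") {
      decoded[input.name] = "0x" + word.slice(24); // last 20 bytes of the word
    } else if (input.type === "uint256") {
      decoded[input.name] = BigInt("0x" + word).toString(10);
    } else {
      decoded[input.name] = /^0*1$/.test(word) ? "true" : "false";
    }
  });
  return decoded;
}

// ERC-20 transfer(address,uint256); 0xa9059cbb is its well-known selector.
const transferFragment: FunctionFragment = {
  name: "transfer",
  selector: "0xa9059cbb",
  inputs: [
    { name: "to", type: "address" },
    { name: "amount", type: "uint256" },
  ],
};

// Example calldata built by hand: selector + padded address + padded amount.
const exampleCalldata =
  "0xa9059cbb" +
  "cC60F45e0507032036033b361d3a6457b9F0283D".padStart(64, "0") +
  (1000).toString(16).padStart(64, "0");

const decodedTransfer = decodeStaticCalldata(transferFragment, exampleCalldata);
console.log(decodedTransfer);
```

The decoded address comes back lowercased because the sketch does not apply the EIP-55 checksum casing.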

4.2 Function Metadata

This category refers to leveraging function metadata, apart from the function itself (e.g. ABI) to provide more information about it.

ABI-Decoding + NatSpec

There's additional human-readable information found in well written Solidity contracts in the form of NatSpec, a commenting syntax.

NatSpec has @notice and @dev fields that are intended to explain to the "user" and the "developer" respectively what this method does, as well as the @param tag to document the "parameters".

Example from my talk at the BlockSplit 2022 conference (link to slides):

/// @dev Allows to swap/replace an owner from the Safe with another address.
///      This can only be done via a Safe transaction.
/// @notice Replaces the owner `oldOwner` in the Safe with `newOwner`.
/// @param prevOwner Owner that pointed to the owner to be replaced in the linked list
/// @param oldOwner Owner address to be replaced.
/// @param newOwner New owner address.
function swapOwner(
    address prevOwner,
    address oldOwner,
    address newOwner
) public authorized {...

So that a user can see a more human-readable message about the method they are calling:

userdoc: Replaces the owner `oldOwner` in the Safe with `newOwner`
devdoc: Allows to swap/replace an owner from the Safe with another address. This can only be done via a Safe transaction.
function: swapOwner(address prevOwner, address oldOwner, address newOwner)
  - prevOwner:
    - documentation: Owner that pointed to the owner to be replaced in the linked list
    - value: 0x1F98431c8aD98523631AE4a59f267346ea31F984
  - oldOwner:
    - documentation: Owner address to be replaced.
    - value: 0xcC60F45e0507032036033b361d3a6457b9F0283D
  - newOwner:
    - documentation: New owner address.
    - value: 0x83D0360050703233b361d3a6457b9F2cC60F45e0

Additionally NatSpec also suggests using "dynamic expressions" to put variable names in backticks like `oldOwner` for the consumers of the spec to dynamically fill the values. With that a user calling this function can see a much more human readable message:

userdoc: Replaces the owner 0xcC60F45e0507032036033b361d3a6457b9F0283D in the Safe with 0x83D0360050703233b361d3a6457b9F2cC60F45e0
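Filling in these dynamic expressions can be sketched in a few lines of TypeScript. This is a deliberately simplified take: it only substitutes plain parameter names found between backticks and leaves anything else untouched, whereas a full interpreter like radspec can evaluate richer expressions. `fillNatSpec` is a name made up for this example.

```typescript
// Replace `paramName` placeholders in a NatSpec notice with decoded argument values.
// Placeholders without a matching argument are left as they are.
function fillNatSpec(notice: string, args: Record<string, string>): string {
  return notice.replace(
    /`([A-Za-z_$][A-Za-z0-9_$]*)`/g,
    (placeholder: string, paramName: string) =>
      Object.prototype.hasOwnProperty.call(args, paramName)
        ? args[paramName]
        : placeholder
  );
}

const filledNotice = fillNatSpec(
  "Replaces the owner `oldOwner` in the Safe with `newOwner`.",
  {
    oldOwner: "0xcC60F45e0507032036033b361d3a6457b9F0283D",
    newOwner: "0x83D0360050703233b361d3a6457b9F2cC60F45e0",
  }
);
console.log(filledNotice);
```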

Note that these meaningful messages all assume you are talking to a benign and secure contract. A malicious contract can still have nice human readable NatSpec docs while doing a completely different thing than what it says. All points mentioned above like verification, labeling etc. are requirements for this to make any sense.

NatSpec documentation can be obtained by requesting userdoc and devdoc under outputSelection in the compiler settings. Additionally, it's found in the Solidity contract metadata.json. The hash of this file is appended at the end of the contract bytecode and provides extra cryptographic guarantees about the verification. No other verifier leverages or provides this output except Sourcify (obtainable via APIv2). Also see to learn more.
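As a rough illustration of the compiler-settings part, the sketch below builds a Solidity standard JSON input that asks for userdoc, devdoc, and metadata for every contract in every file. The `"*"` wildcards are part of the compiler's standard JSON format. The `buildNatSpecInput` helper and the `StandardJsonInput` type are names I made up for this example.

```typescript
// Simplified shape of the Solidity standard JSON input (only the parts used here).
interface StandardJsonInput {
  language: "Solidity";
  sources: Record<string, { content: string }>;
  settings: { outputSelection: Record<string, Record<string, string[]>> };
}

// Hypothetical helper: wrap source files and request the NatSpec-related outputs.
function buildNatSpecInput(sources: Record<string, string>): StandardJsonInput {
  const wrapped: Record<string, { content: string }> = {};
  Object.keys(sources).forEach((path) => {
    wrapped[path] = { content: sources[path] };
  });
  return {
    language: "Solidity",
    sources: wrapped,
    settings: {
      // "*" for all files, "*" for all contracts in them
      outputSelection: { "*": { "*": ["userdoc", "devdoc", "metadata"] } },
    },
  };
}

const natSpecInput = buildNatSpecInput({
  "contracts/Storage.sol":
    "// SPDX-License-Identifier: MIT\npragma solidity ^0.8.0;\n\ncontract Storage {}\n",
});
console.log(JSON.stringify(natSpecInput.settings, null, 2));
```

With this selection, the compiler output carries the docs under contracts[file][contractName].userdoc and .devdoc.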

We forked the aragon/radspec repo and played around with it as a Metamask Snap but didn't pursue this further for reasons mentioned in the beginning.

As said, NatSpec also falls short in many cases. First of all, it's not backwards compatible: we can't inject NatSpec into already deployed contracts. It's also not always sufficient to make sense of complex transactions. Here's a Uniswap call that actually isn't complex:

A Uniswap Call

commands? inputs? no idea what's going on. Looking at the NatSpec helps a little:

/// @notice Executes encoded commands along with provided inputs. Reverts if deadline has expired.
/// @param commands A set of concatenated commands, each 1 byte in length
/// @param inputs An array of byte strings containing abi encoded inputs for each command
/// @param deadline The deadline by which the transaction must be executed
function execute(bytes calldata commands, bytes[] calldata inputs, uint256 deadline) external payable;

At least now we know what the parameters mean, but we would still have to look at the Commands implementation in the contract to make sense of it. NatSpec only gets us so far.

ERC-4430: Described Transactions

This was an old EIP from @ricmoo and @arachnid to place the human-readable descriptions of functions in the contract itself.

The high-level idea is to have a "describer" function within the contract itself. Wallets etc. would call that describer to show the user a message before sending a tx.

As in the spec:

In many cases, the information that would be necessary for a meaningful description is not present in the final encoded transaction data or message data.

For example, the commit(bytes32) method of ENS places a commitment hash on-chain. The hash contains the blinded name and address; since the name is blinded, the encoded data (i.e. the hash) no longer contains the original values and is insufficient to access the necessary values to be included in a description.

By instead describing the commitment indirectly (with the original information intact: NAME, ADDRESS and SECRET) a meaningful description can be computed (e.g. "commit to NAME for ADDRESS (with SECRET)") and the matching data can be computed (i.e. commit(hash(name, owner, secret))).

The spec proposes an additional contract method such as:

function eip4430Describe(bytes inputs, bytes32 reserved) view returns (string description, bytes execcode)

Instead of calling transfer() directly, you'd call this function, which would return to your wallet a human-readable description and the execcode that the wallet can execute in a subsequent transaction if the description is "accepted".

This is a more "onchain" solution that requires defining these descriptions inside the contract. Because of that, it's also not backwards compatible for existing contracts.

Also related, for signatures: ERC-3224: Described Data
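The describe-then-execute flow can be sketched from the wallet's side as below. Everything here is a stand-in: `Describe` plays the role of a read-only call to the contract's eip4430Describe function, the mock describer and its `0xdeadbeef` execcode are placeholders, and `reviewThenSend` is a name I made up. It is meant to show the order of steps, not a real wallet integration.

```typescript
// What the describer returns, mirroring (string description, bytes execcode).
interface DescribedCall {
  description: string;
  execcode: string;
}

// Stand-in for a read-only call to eip4430Describe(bytes inputs, bytes32 reserved).
type Describe = (inputs: string, reserved: string) => DescribedCall;

const RESERVED_ZERO = "0x" + "00".repeat(32);

// Hypothetical wallet flow: describe first, send only what the user accepted.
function reviewThenSend(
  describe: Describe,
  inputs: string,
  userAccepts: (description: string) => boolean,
  send: (data: string) => void
): boolean {
  const { description, execcode } = describe(inputs, RESERVED_ZERO);
  if (!userAccepts(description)) {
    return false; // nothing is sent if the user rejects the description
  }
  send(execcode);
  return true;
}

// Mock describer; a real one would live in the contract.
const mockDescribe: Describe = (inputs) => ({
  description: `commit to ${inputs}`,
  execcode: "0xdeadbeef", // placeholder, not real calldata
});

const sentTransactions: string[] = [];
const accepted = reviewThenSend(
  mockDescribe,
  "alice.eth",
  (description) => description === "commit to alice.eth",
  (data) => sentTransactions.push(data)
);
console.log(accepted, sentTransactions);
```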

Rich Site-Proposed Contract Metadata

This was proposed by @danfinlay on the Ethereum Magicians forum but didn't proceed to an EIP.

My understanding of this proposal is that the wallet's first point of contact with the contract (e.g. the dapp website) provides human-readable metadata to be saved in the wallet. From then on, the wallet always shows this human-readable information to the user.

This would be backwards compatible for existing contracts.

I like this proposal's approach of providing this information to the user at the very first point of contact with the dApp. The website would be responsible for passing this information, and in a reasonably secure way if served over HTTPS.

Function/Event Templating and ERC-7730

Similar to the above proposal, we can create a "function templates registry" that maps a chainId+address+function to a human-readable string template that can be dynamically filled. While the above is, from my understanding, more decentralized in that every dApp provides its own messages, here it's a centralized registry, possibly with a template language.

E.g. the function or event on a specific contract

borrowAsset(uint256 _borrowAmount,uint256 _collateralAmount,address _receiver)
BorrowAsset (index_topic_1 address _borrower, index_topic_2 address _receiver, uint256 _borrowAmount, uint256 _sharesAdded)

can become "Borrowed {tokenAmount} {tokenName} on {dAppName}".
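A registry like this can be sketched as a plain lookup table plus a tiny template filler. The key format, the placeholder address, and the example values (`100`, `DAI`, `ExampleLend`) are all made up for illustration. How the values are derived from the decoded arguments is the hard part and is not shown here.

```typescript
// Hypothetical open registry: "chainId:address:signature" -> message template.
type TemplateRegistry = Record<string, string>;

function registryKey(chainId: number, address: string, signature: string): string {
  return `${chainId}:${address.toLowerCase()}:${signature}`;
}

// Returns undefined for unknown calls so that the wallet can warn instead of guessing.
function describeCall(
  registry: TemplateRegistry,
  chainId: number,
  address: string,
  signature: string,
  values: Record<string, string>
): string | undefined {
  const template = registry[registryKey(chainId, address, signature)];
  if (template === undefined) {
    return undefined;
  }
  return template.replace(/\{(\w+)\}/g, (placeholder: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : placeholder
  );
}

// Placeholder address, not a real deployment.
const lendingAddress = "0x0000000000000000000000000000000000000001";
const templateRegistry: TemplateRegistry = {
  [registryKey(1, lendingAddress, "borrowAsset(uint256,uint256,address)")]:
    "Borrowed {tokenAmount} {tokenName} on {dAppName}",
};

const borrowMessage = describeCall(
  templateRegistry,
  1,
  lendingAddress,
  "borrowAsset(uint256,uint256,address)",
  { tokenAmount: "100", tokenName: "DAI", dAppName: "ExampleLend" }
);
console.log(borrowMessage);
```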

I guess a lot of the explorers, portfolio managing apps, wallets do this for popular contracts and functions. The problem again here is everyone's trying to solve it for themselves. There's no coordinated effort that I'm aware of that is open and solves this once and for all! (Edit: See below) Except maybe rotki because it's purely open-source but I don't know their registry or DB for this.

IMO it's fine that this is centralized initially and held in a place even like Github to begin with. Eventually there will be the problem of "who" provides these messages and whether they are trusted parties. E.g. what if an attacker is able to inject a misleading message? Or is it going to be permissionless, so that attackers can also provide human-readable messages for their own malicious contracts? One can start small here with a trusted set of participants and expand slowly.

ERC-7730 (by Ledger)

Turns out ERC-7730 is exactly what I describe above (see full EIP). The spec proposes a JSON schema to provide human readable messages and annotate function parameters to format them nicely.

Example from a Lido contract:

"display": {
  "formats": {
    "wrap(uint256)": {
      "intent": "Exchange stETH to wstETH",
      "fields": [
        {
          "path": "_stETHAmount",
          "label": "Amount to exchange",
          "format": "tokenAmount",
          "params": { "token": "$.metadata.constants.stETHaddress" }
        }
      ]
    }
  }
}

When the user calls wrap(1000000000000000000) they will be shown:

Review transaction to: Exchange stETH to wstETH 

Amount to exchange: 1 stETH
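To show how such a descriptor could turn into the message above, here is a small TypeScript sketch. It is a simplified reading of the idea and not the ERC-7730 resolution algorithm: the types are trimmed down, and the token symbol and decimals are passed in directly instead of being resolved through the `params.token` path. stETH uses 18 decimals, so a raw amount of 10^18 renders as 1 stETH.

```typescript
// Trimmed-down, hypothetical versions of the descriptor types.
interface DisplayField {
  path: string;
  label: string;
  format: "tokenAmount" | "raw";
}
interface DisplayFormat {
  intent: string;
  fields: DisplayField[];
}
interface TokenInfo {
  symbol: string;
  decimals: number;
}

// Format a raw integer amount (as a decimal string) using the token's decimals.
function formatTokenAmount(raw: string, token: TokenInfo): string {
  if (!/^\d+$/.test(raw)) {
    throw new Error("raw amount must be a non-negative integer string");
  }
  const padded = raw.padStart(token.decimals + 1, "0");
  const whole = padded.slice(0, padded.length - token.decimals).replace(/^0+(?=\d)/, "");
  const fraction = padded.slice(padded.length - token.decimals).replace(/0+$/, "");
  return `${whole}${fraction ? "." + fraction : ""} ${token.symbol}`;
}

function renderReview(
  format: DisplayFormat,
  args: Record<string, string>,
  token: TokenInfo
): string[] {
  const lines = [`Review transaction to: ${format.intent}`];
  for (const field of format.fields) {
    const raw = args[field.path];
    if (raw === undefined) {
      throw new Error(`Missing argument ${field.path}`);
    }
    const shown = field.format === "tokenAmount" ? formatTokenAmount(raw, token) : raw;
    lines.push(`${field.label}: ${shown}`);
  }
  return lines;
}

const wrapFormat: DisplayFormat = {
  intent: "Exchange stETH to wstETH",
  fields: [{ path: "_stETHAmount", label: "Amount to exchange", format: "tokenAmount" }],
};

const reviewLines = renderReview(
  wrapFormat,
  { _stETHAmount: "1000000000000000000" }, // 10^18 raw units
  { symbol: "stETH", decimals: 18 }
);
console.log(reviewLines.join("\n"));
```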

From what I understand this also supports nested calls or multicalls if this is specified properly.

The whole submission to wallet workflow is depicted nicely in this diagram from the spec:

Clear sign workflow

The biggest con I see with this initiative is it's branded everywhere as "by Ledger". Kudos to Ledger for coming up with this as an open spec but as long as this remains under Ledger's ownership in their repos, there's no incentive for other wallets to adopt this standard. Such an initiative should be led by a neutral entity or a consortium, and really not be marketed with the wallet's name everywhere. This is already mentioned in the original EIP:

  • Foundation operated repository, like ethereum chainID list: good alternative between decentralization and discoverability.
  • Ledger repository: as a short term solution, Ledger is providing a central repository (See Ledger GitHub repository)

To me this is the biggest drawback of this initiative and Ledger should stop marketing this through themselves if they really want this to succeed.

4.3 Transaction Simulation

This is one field where good progress has been made. A lot of services provide tools or APIs to simulate the steps and outcomes of a transaction, and simulation is somewhat well integrated into wallets.

These APIs should be easy to hook into. Here are some I'm aware of:

Rabby Wallet does a great job here to show the simulation results to the user:

Rabby Uniswap swap simulation

But as a reminder, this is not a silver bullet either. For example, it isn't able to find out what I'll receive from the CoWSwap swap, likely because of how CoWSwap works with solving auctions etc:

Rabby Cowswap swap simulation

5. Browser Wallet to Hardware Wallet Integrity

Even if we did everything above perfectly, we wouldn't be able to prevent the ByBit attack, because what the signers were seeing on their machines was benign but they were actually signing a malicious transaction. Of course they should've checked what they were signing on their hardware wallets, but this is obviously not working. A lot of users slack off, don't want to read hexadecimals (understandably), and just accept what's provided.

I think there's a lot of room for improvement in the SW to HW wallet integrity field and a lot of low-hanging fruit.

One example I thought about in this tweet is using emojis to do integrity checks instead of hexadecimals.

8 emojis = 32 bytes, we could hash the tx content—if anything changes, the emojis shift completely.

  • Browser: 😒🙏👰👨‍🦼🦷👳‍♂️👷‍♀️🧓
  • HW Wallet: 😒🙏👰👨‍🦼🦷👳‍♂️👷‍♀️🧓

✅ Pass

  • Browser: 😒🙏👰👨‍🦼🦷👳‍♂️👷‍♀️🧓
  • HW Wallet: 😁☺️😾🤲🙎‍♂️☂️🩰🩳

❌ Fail
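Here is one way the emoji check could be sketched, with loud assumptions: the transaction is serialized to a string by some agreed-upon method (not shown), the hash is SHA-256, and each of the first 8 hash bytes picks one emoji from a 256-symbol alphabet starting at U+1F400. The alphabet choice and the `emojiFingerprint` name are made up here. Note that 8 symbols from a 256-symbol alphabet only cover 64 bits of the hash, so this trades some collision resistance for readability.

```typescript
import { createHash } from "node:crypto";

// Map the first `length` bytes of SHA-256(serializedTx) to emojis.
// Alphabet assumption: the 256 code points U+1F400..U+1F4FF.
function emojiFingerprint(serializedTx: string, length: number = 8): string {
  const digest = createHash("sha256").update(serializedTx, "utf8").digest();
  let fingerprint = "";
  for (let i = 0; i < length; i++) {
    fingerprint += String.fromCodePoint(0x1f400 + digest[i]);
  }
  return fingerprint;
}

// Hypothetical serialized transactions; only the recipient differs.
const shownInBrowser = "to=0xcC60F45e0507032036033b361d3a6457b9F0283D;value=1";
const sentToDevice = "to=0x83D0360050703233b361d3a6457b9F2cC60F45e0;value=1";

const browserEmojis = emojiFingerprint(shownInBrowser);
const honestDeviceEmojis = emojiFingerprint(shownInBrowser);
const tamperedDeviceEmojis = emojiFingerprint(sentToDevice);

console.log("Browser:  ", browserEmojis);
console.log("HW Wallet:", honestDeviceEmojis, browserEmojis === honestDeviceEmojis ? "pass" : "fail");
console.log("HW Wallet:", tamperedDeviceEmojis, browserEmojis === tamperedDeviceEmojis ? "pass" : "fail");
```

For this to help at all, the hardware wallet has to compute the fingerprint itself from what it actually signs; showing emojis passed along by the browser would defeat the purpose.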

HW Wallet limitations

One thing to keep in mind is that a large portion of the hardware wallets are limited in their resources as well as their displays. There is only so much information we can put on them. Newer wallets have larger and more information-rich displays but they are also expensive. An ideal solution should cover cheaper alternatives too, even if it is not fully backwards compatible with the existing lower-grade hardware wallets.


The journey to solve blind signing has taught us some valuable lessons about the complexity of the problem and I wanted to share what we have found here. As you can see, "transaction safety" is not just clear-signing; it's layers and layers of security and coordination. The clear-signing part has also seen multiple attempts at a solution and none is perfect. Still, the main issue is that none of the proposals is in production (except simulation) and they have all remained wishful specs. Having something is better than having nothing.

While we started with the goal of making transactions more human-readable, we discovered that this challenge requires a coordinated effort across the entire ecosystem. I am hoping the ByBit hack serves as a wake-up call to the entire ecosystem and we start working on this intentionally. As Sourcify we have more bandwidth compared to back then and will be willing to support the coordination efforts.

If you liked this, please share and spread the word and we can start some open discussion. If you have feedback or other ideas, also let me know on X! @kaanuzdogan

APIv2: Getting Verified Contracts

· 8 min read

We have been working on the new APIv2 for Sourcify and just shipped the first set of endpoints to look up verified contracts and GET various types of information about a verified contract. The new information-rich endpoints allow selectively fetching the specific fields needed about the verified contract and provide a wide selection of fields.

In this post we will walk through the new functionality brought by these endpoints and see how to use them.

If you want to just look at the full up-to-date spec, you can go to:


APIv2 has been the biggest priority of our project since Q4 2024 and we aim to ship it fully by Q1 2025. We've been discussing and having fruitful conversations around how to design the new API as can be seen in the design issue. Thanks everyone who contributed to the conversation!

The main problems we wanted to solve with and the main features we wanted from the APIv2 were the following:

  • With the legacy API users had to wait for the response, i.e. wait for the compilation and the verification to finish. This can easily take a couple of minutes and the requests are left hanging. The new design should have ticketing/job-ids and users should poll with this id.
  • The "perfect" vs "partial" naming is confusing.
  • The legacy API is fully based on the metadata.json. While we want to keep full support for metadata.json verification and "perfect" matching, we wanted to have standard JSON input as our main endpoint's base.
  • After moving from a filesystem-based storage to the database, we were able to share a lot more data around the verification, and the legacy API didn't expose this information.

You can read more in this issue.

In the end we settled for the following API design (as of 11 Feb 2025)

The design specifies endpoints to "Verify Contracts", check for "Verification Jobs", and to do "Contract Lookup". Here we'll talk about the "Contract Lookup" that we shipped. The others are still being developed.

New endpoints

With this release we have 2 new endpoints:

  • GET /v2/contracts/{chainId}: retrieve the latest verified contracts for a chain
  • GET /v2/contract/{chainId}/{address}: retrieve a specific contract and related information

GET /v2/contracts/{chainId}

This endpoint is fairly straightforward and returns an array of verified contracts. Users can provide the following parameters:

  • limit: number of contracts to return (max. 200)
  • sort: by most recent first (desc, default), or by oldest first (asc)
  • afterMatchId: The last matchId (an incremental contract ID) returned to get contracts older or newer than it (depending on sort)

{
  "results": [
    {
      "match": "exact_match",
      "creationMatch": "exact_match",
      "runtimeMatch": "exact_match",
      "chainId": "11155111",
      "address": "0x7Bec3080cdf73a9a39997C860c19377Ac1E6E6BE",
      "verifiedAt": "2025-02-11T11:49:45Z",
      "matchId": "855557"
    }
  ]
}

Check this example:
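For paging through results, a request URL can be put together as in the sketch below. The base URL `https://sourcify.dev/server` is an assumption on my side, and `buildContractsListUrl` and `nextCursor` are made-up helper names. Only the path and the three query parameters come from the endpoint description above.

```typescript
interface ListOptions {
  limit?: number; // max. 200
  sort?: "asc" | "desc";
  afterMatchId?: string;
}

// Hypothetical helper to build a GET /v2/contracts/{chainId} URL.
function buildContractsListUrl(
  baseUrl: string,
  chainId: string,
  options: ListOptions = {}
): string {
  const query: string[] = [];
  if (options.limit !== undefined) {
    if (options.limit < 1 || options.limit > 200) {
      throw new Error("limit must be between 1 and 200");
    }
    query.push(`limit=${options.limit}`);
  }
  if (options.sort !== undefined) {
    query.push(`sort=${options.sort}`);
  }
  if (options.afterMatchId !== undefined) {
    query.push(`afterMatchId=${encodeURIComponent(options.afterMatchId)}`);
  }
  const path = `${baseUrl.replace(/\/+$/, "")}/v2/contracts/${encodeURIComponent(chainId)}`;
  return query.length > 0 ? `${path}?${query.join("&")}` : path;
}

// The matchId of the last result is the cursor for the next page.
function nextCursor(results: { matchId: string }[]): string | undefined {
  return results.length > 0 ? results[results.length - 1].matchId : undefined;
}

// Assumed base URL; adjust to the server you are actually querying.
const firstPageUrl = buildContractsListUrl("https://sourcify.dev/server", "11155111", {
  limit: 200,
  sort: "desc",
});
const secondPageUrl = buildContractsListUrl("https://sourcify.dev/server", "11155111", {
  limit: 200,
  sort: "desc",
  afterMatchId: nextCursor([{ matchId: "855557" }]),
});
console.log(firstPageUrl);
console.log(secondPageUrl);
```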

GET /v2/contract/{chainId}/{address}

This endpoint is the more interesting one, in that, it allows us to get all details beyond just if a contract is verified.

By default this endpoint returns the minimal verification information:

{
  "matchId": "2115",
  "creationMatch": "exact_match",
  "runtimeMatch": "exact_match",
  "verifiedAt": "2024-08-08T10:05:44Z",
  "match": "exact_match",
  "chainId": "1",
  "address": "0x00000000219ab540356cBB839Cbe05303d7705Fa"
}

As you can see we no longer use perfect and partial to refer to matches. Instead we use exact_match and match respectively. This is because the wording "partial" was causing confusion leading users to think their contract is not verified. This way we both convey that the contract is indeed verified but also by "exact_match" we express this is a superior match than just a "match".

Above is the minimal contract information. In addition, users can request more fields by passing the fields query parameter, or leave fields out with omit.
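A lookup URL with a field selection could be built roughly like this. The base URL is again an assumption, the helper name is made up, and the sketch assumes that fields and omit are not combined in one request. The field names used in the example calls are ones that appear in this post.

```typescript
// Either pick fields, or omit fields, or pass nothing for the minimal response.
type FieldSelection = { fields: string[] } | { omit: string[] };

// Hypothetical helper to build a GET /v2/contract/{chainId}/{address} URL.
function buildContractLookupUrl(
  baseUrl: string,
  chainId: string,
  address: string,
  selection?: FieldSelection
): string {
  const path = `${baseUrl.replace(/\/+$/, "")}/v2/contract/${chainId}/${address}`;
  if (selection === undefined) {
    return path;
  }
  if ("fields" in selection) {
    return `${path}?fields=${selection.fields.join(",")}`;
  }
  return `${path}?omit=${selection.omit.join(",")}`;
}

const depositContract = "0x00000000219ab540356cBB839Cbe05303d7705Fa";
const assumedBaseUrl = "https://sourcify.dev/server"; // assumption, not from the post

const abiOnlyUrl = buildContractLookupUrl(assumedBaseUrl, "1", depositContract, {
  fields: ["abi"],
});
const specificFieldsUrl = buildContractLookupUrl(assumedBaseUrl, "1", depositContract, {
  fields: ["compilation.language", "deployment.transactionHash"],
});
const withoutBytecodeUrl = buildContractLookupUrl(assumedBaseUrl, "1", depositContract, {
  omit: ["runtimeBytecode", "creationBytecode"],
});
console.log(abiOnlyUrl);
console.log(specificFieldsUrl);
console.log(withoutBytecodeUrl);
```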

For example, if you just need the ABI:

{
  "abi": [
    { "type": "constructor", "inputs": [], "stateMutability": "nonpayable" },
    {
      "name": "supportsInterface",
      "type": "function",
      "inputs": [{ "name": "interfaceId", "type": "bytes4", "internalType": "bytes4" }],
      "outputs": [{ "name": "", "type": "bool", "internalType": "bool" }],
      "stateMutability": "pure"
    }
  ],
  "matchId": "2115",
  "creationMatch": "exact_match",
  "runtimeMatch": "exact_match",
  "verifiedAt": "2024-08-08T10:05:44Z",
  "match": "exact_match",
  "chainId": "1",
  "address": "0x00000000219ab540356cBB839Cbe05303d7705Fa"
}

Or you can ask for every single field by passing all:

Let's have a look at each field:

// This is the minimal verification information
"match": "match",
"creationMatch": "match",
"runtimeMatch": "match",
"chainId": "11155111",
"address": "0xDFEBAd708F803af22e81044aD228Ff77C83C935c",
"verifiedAt": "2024-07-24T12:00:00Z",
"matchId": "3266227",
// All information related to the creation bytecode of the contract is under this field.
"creationBytecode": {
  "onchainBytecode": "0x608060405234801561001057600080fd5b5060043610610036570565b6000819050919050565b600080fd5b61010c816100f4565b811461011757600080fd5b5056fea264697066735821220404e37f487a89a932dca5e77faaf6ca2de3b991f93d230604b1b8daaef64766264736f6c63430008070033",
  "recompiledBytecode": "0x608060405234801561001057600080fd5b5060043610610036570565b6000819050919050565b600080fd5b61010c816100f4565b811461011757600080fd5b5056fea264697066735821220404e37f487a89a932dca5e77faaf6ca2de3b991f93d230604b1b8daaef64766264736f6c63430008070033",
  "sourceMap": "73951:11562:0:-:0;;;;;;;;;;;;-1:-1:-1;63357:7:0;:15;;-1:-1:-1;;63357:15:0;;;73951:11562;;;;;;",
  // Positions of the linked library addresses of the given libraries in the bytecode. "evm.bytecode.linkReferences" output of the compiler
  "linkReferences": {
    "contracts/AmplificationUtils.sol": {
      "AmplificationUtils": [
        { "start": 3078, "length": 20 }
      ]
    },
    "contracts/SwapUtils.sol": {
      "SwapUtils": [
        { "start": 2931, "length": 20 }
      ]
    }
  },
  // The position and the value of the CBOR auxdata (or metadata) section in the bytecode. See and for details
  "cborAuxdata": {
    "1": {
      "value": "0xa2646970667358221220d6808f0352d5e503f1f878b19b1bf46c893bac1e20b3c51884efb58a87435b5564736f6c634300080a0033",
      "offset": 18685
    },
    "2": {
      "value": "0xa264697066735822122017bf4253b73b339897d7c117916781f30b434e6caa783b20eb15065469814dcf64736f6c634300080a0033",
      "offset": 18465
    }
  },
  // Transformations are the operations done on the compiled bytecode to reach the matching onchain bytecode.
  // This is based on the Verified Alliance schema:
  // Also read for more info:
  // Creation bytecode can have "library", "cborAuxdata", and "constructorArguments" type transformations
  "transformations": [
    {
      "id": "1",
      "type": "replace",
      "offset": 18040,
      "reason": "cborAuxdata"
    },
    {
      "type": "insert",
      "offset": 6183,
      "reason": "constructorArguments"
    },
    {
      "id": "sources/lib/MyLib.sol:MyLib",
      "type": "replace",
      "offset": 582,
      "reason": "library"
    }
  ],
  // Corresponding values for each transformation
  "transformationValues": {
    "libraries": {
      "sources/lib/MyLib.sol:MyLib": "0x40b70a4904fad0ff86f8c901b231eac759a0ebb0"
    },
    "constructorArguments": "0x00000000000000000000000085fe79b998509b77bf10a8bd4001d58475d29386",
    "cborAuxdata": {
      "0": "0xa26469706673582212201c37bb166aa1bc4777a7471cda1bbba7ef75600cd859180fa30d503673b99f0264736f6c63430008190033"
    }
  }
},
// All information related to the runtime bytecode
"runtimeBytecode": {
  "onchainBytecode": "0x608060405234801561001057600080fd5b5060043610610036570565b6000819050919050565b600080fd5b61010c816100f4565b811461011757600080fd5b5056fea264697066735821220404e37f487a89a932dca5e77faaf6ca2de3b991f93d230604b1b8daaef64766264736f6c63430008070033",
  "recompiledBytecode": "0x608060405234801561001057600080fd5b5060043610610036570565b6000819050919050565b600080fd5b61010c816100f4565b811461011757600080fd5b5056fea264697066735821220404e37f487a89a932dca5e77faaf6ca2de3b991f93d230604b1b8daaef64766264736f6c63430008070033",
  "sourceMap": "73951:11562:0:-:0;;;;;;;;;;;;-1:-1:-1;63357:7:0;:15;;-1:-1:-1;;63357:15:0;;;73951:11562;;;;;;",
  // Same as in creation bytecode, but for runtime bytecode
  "linkReferences": {
    "contracts/AmplificationUtils.sol": {
      "AmplificationUtils": [
        { "start": 3078, "length": 20 }
      ]
    },
    "contracts/SwapUtils.sol": {
      "SwapUtils": [
        { "start": 2931, "length": 20 }
      ]
    }
  },
  // Same as creation bytecode for runtime bytecode.
  "cborAuxdata": {
    "1": {
      "value": "0xa2646970667358221220d6808f0352d5e503f1f878b19b1bf46c893bac1e20b3c51884efb58a87435b5564736f6c634300080a0033",
      "offset": 18685
    },
    "2": {
      "value": "0xa264697066735822122017bf4253b73b339897d7c117916781f30b434e6caa783b20eb15065469814dcf64736f6c634300080a0033",
      "offset": 18465
    }
  },
  // "evm.deployedBytecode.immutableReferences" output of the compiler
  "immutableReferences": {
    "1050": [
      { "start": 312, "length": 32 },
      { "start": 2631, "length": 32 }
    ]
  },
  // Same as the creation bytecode
  // The runtime bytecode can take the following transformation types: "library", "cborAuxdata", "immutable", "callProtection"
  "transformations": [
    {
      "id": "CriminalDogs.sol:SafeMath",
      "type": "replace",
      "offset": 1863,
      "reason": "library"
    },
    {
      "id": "1",
      "type": "replace",
      "offset": 2747,
      "reason": "cborAuxdata"
    },
    {
      "id": "1466",
      "type": "replace",
      "offset": 18703,
      "reason": "immutable"
    },
    {
      "id": "1466",
      "type": "replace",
      "offset": 18939,
      "reason": "immutable"
    },
    {
      "type": "replace",
      "offset": 1,
      "reason": "callProtection"
    }
  ],
  // Corresponding values for the transformations
  "transformationValues": {
    "libraries": {
      "contracts/order/OrderUtils.sol:OrderUtilsLib": "0x40b70a4904fad0ff86f8c901b231eac759a0ebb0"
    },
    "immutables": {
      "1466": "0x000000000000000000000000000000007f56768de3133034fa730a909003a165"
    },
    "cborAuxdata": {
      "1": "0xa26469706673582212201c37bb166aa1bc4777a7471cda1bbba7ef75600cd859180fa30d503673b99f0264736f6c63430008190033"
    },
    "callProtection": "0x9deba23b95205127e906108f191a26f5d520896a"
  }
},
// Information related to the onchain deployment of this contract
"deployment": {
  "transactionHash": "0xb6ee9d528b336942dd70d3b41e2811be10a473776352009fd73f85604f5ed206",
  "blockNumber": "21721660",
  "transactionIndex": "0",
  "deployer": "0xDFEBAd708F803af22e81044aD228Ff77C83C935c"
},
// The source files of this contract.
"sources": {
  "contracts/Storage.sol": {
    "content": "// SPDX-License-Identifier: MIT\npragma solidity ^0.8.0;\n\ncontract Storage {\n uint256 number;\n\n function setNumber(uint256 newNumber) public {\n number = newNumber;\n }\n\n function getNumber() public view returns (uint256) {\n return number;\n }\n}\n"
  },
  "contracts/Owner.sol": {
    "content": "// SPDX-License-Identifier: MIT\npragma solidity ^0.8.0;\n\ncontract Owner {\n address public owner;\n\n constructor() {\n owner = msg.sender;\n }\n}\n"
  }
},
// Compilation related information
"compilation": {
  "language": "Solidity",
  "compiler": "solc",
  "compilerVersion": "v0.8.12+commit.f00d7308",
  "compilerSettings": {},
  "name": "MyContract",
  "fullyQualifiedName": "contracts/MyContract.sol:MyContract"
},
"abi": [],
"userdoc": {},
"devdoc": {},
"storageLayout": {},
// metadata.json output of the Solidity compiler. For Vyper contracts, Sourcify generates and writes a metadata file on its own for compatibility reasons.
"metadata": {},
// This essentially contains duplicate information as above.
// The purpose of this field is to easily integrate into tooling that uses the standard JSON syntax.
"stdJsonInput": {
  "language": "Solidity",
  "sources": {
    "contracts/Storage.sol": {
      "content": "// SPDX-License-Identifier: MIT\npragma solidity ^0.8.0;\n\ncontract Storage {\n uint256 number;\n\n function setNumber(uint256 newNumber) public {\n number = newNumber;\n }\n\n function getNumber() public view returns (uint256) {\n return number;\n }\n}\n"
    },
    "contracts/Owner.sol": {
      "content": "// SPDX-License-Identifier: MIT\npragma solidity ^0.8.0;\n\ncontract Owner {\n address public owner;\n\n constructor() {\n owner = msg.sender;\n }\n}\n"
    }
  },
  // compilation.compilerSettings above
  "settings": {}
},
// Similarly here for easy tooling integration as the stdJsonInput.
// Only contains the target contract under "contracts".
"stdJsonOutput": {
  "sources": {},
  "contracts": {}
},
// Proxy information. The proxy resolution is done on the fly on every call.
"proxyResolution": {
  "isProxy": true,
  "proxyType": "ZeppelinOSProxy",
  "implementations": [
    { "address": "0x43506849D7C04F9138D1A2050bbF3A0c054402dd" }
  ]
}

You can be really specific about the fields you need:,compilation.language,deployment.transactionHash

or just omit the fields you don't need,runtimeBytecode,creationBytecode
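As a rough sketch of how a client could assemble such a lookup request: the helper below is hypothetical, and the base URL, the `/v2/contract/{chainId}/{address}` path, and the `fields` / `omit` query parameter names are assumptions for illustration. Check the API spec for the authoritative shape.

```typescript
// Hypothetical helper: builds a lookup URL that either selects or omits fields.
// The path layout and the parameter names are assumptions, not taken from the spec.
interface LookupOptions {
  fields?: string[];
  omit?: string[];
}

function buildLookupUrl(
  baseUrl: string,
  chainId: number,
  address: string,
  options: LookupOptions = {}
): string {
  const fields = options.fields || [];
  const omit = options.omit || [];
  // Assumed here: selecting and omitting at the same time makes no sense.
  if (fields.length > 0 && omit.length > 0) {
    throw new Error("Use either `fields` or `omit`, not both");
  }
  let url = `${baseUrl}/v2/contract/${chainId}/${address}`;
  if (fields.length > 0) url += `?fields=${fields.join(",")}`;
  if (omit.length > 0) url += `?omit=${omit.join(",")}`;
  return url;
}
```

Selecting only `compilation.language` and `deployment.transactionHash` keeps the response small, which matters because bytecodes and sources can be large.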

Proxy Resolution

The API uses the WhatsABI library to resolve proxies from the runtime bytecode of a contract.

These are the supported proxy types as of now:

export type ProxyType =
| "EIP1167Proxy"
| "FixedProxy"
| "EIP1967Proxy"
| "GnosisSafeProxy"
| "DiamondProxy"
| "ZeppelinOSProxy"
| "SequenceWalletProxy";

See proxy-utils.ts to see how resolution is done.
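To give a feel for what resolving a proxy from runtime bytecode means, here is a minimal sketch for the simplest case, the EIP-1167 minimal proxy, whose runtime bytecode is a fixed template with the implementation address in the middle. This is not WhatsABI's or Sourcify's code; the function name is made up, and the other proxy types need storage reads or more involved analysis.

```typescript
// EIP-1167 minimal proxy runtime bytecode: <prefix><20-byte implementation><suffix>
const EIP1167_PREFIX = "363d3d373d3d3d363d73";
const EIP1167_SUFFIX = "5af43d82803e903d91602b57fd5bf3";

// Hypothetical helper: returns the implementation address, or null if the
// bytecode is not an exact EIP-1167 minimal proxy.
function resolveEip1167(runtimeBytecode: string): string | null {
  const code = runtimeBytecode.toLowerCase().replace(/^0x/, "");
  const expectedLength = EIP1167_PREFIX.length + 40 + EIP1167_SUFFIX.length;
  if (code.length !== expectedLength) return null;
  if (!code.startsWith(EIP1167_PREFIX) || !code.endsWith(EIP1167_SUFFIX)) {
    return null;
  }
  return "0x" + code.slice(EIP1167_PREFIX.length, EIP1167_PREFIX.length + 40);
}
```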

Next Steps

As mentioned, these are just the lookup endpoints of APIv2. In the upcoming weeks we will be developing the verification and verificationJob endpoints, which will support ticketing and polling instead of hanging requests.

Once again, you can see the full up-to-date API spec at

You can follow along the development in our tracker issue:

Beyond that, you can see our roadmap for the next quarters in our Milestones View and what we are currently working on in our Sprint Board. We welcome feedback and discussions!

A Technical Walkthrough of Source Code Verification

· 19 min read

Transparency, verifiability, and trustlessness are the core values of blockchains, and of Ethereum especially. We want the smart contracts we interact with to be open-source (right?). However, you can't be sure that the open-source code you see is actually the code that lives on chain. I can show you benign code on GitHub and convince you to send your assets to a contract, while in reality a malicious contract is deployed at that address.

This is where source code verification comes in. Source code verification makes sure the human-readable source code you see is the same as the one that was deployed on chain.


A verified contract is not necessarily safe to interact with. Verification does not look into what the contract does, only whether it corresponds to the given source code. The source code itself can be malicious or contain bugs. It is the auditors' and the community's responsibility to check the code's security.

Smart contracts are written in human-readable programming languages like Vyper or Solidity. But they are compiled to and deployed in bytecode (1s and 0s), so they are not human-readable.


What a contract looks like on Ethereum - 0x7ecedB5ca848e695ee8aB33cce9Ad1E1fe7865F8 on Ethereum Holesky Testnet

We can go from Solidity/Vyper code to bytecode, but not the other way around. This information is lost during compilation and we need the original source-code that compiles down to this bytecode.

In simple terms, source code verification works by:

  1. Taking a smart-contract written in a human-readable programming language (Solidity/Vyper)
  2. Compiling it down to bytecode
  3. Comparing the compiled bytecode with the on-chain bytecode that is deployed at a certain chain and address.

Of course this is a simplified explanation and there are some nuances to it. In this blog post, we will dive deeper into the technical details of the process and walk through how a verification works behind the scenes.

How to Verify?

To verify a contract on Sourcify, we need the following:

  • chainId where the contract is deployed
  • address of the contract
  • metadata.json: The Solidity Contract Metadata file. This JSON file is output by the compiler and contains information about how to interact with this contract (abi, userdoc, devdoc), and how to reproduce the compilation (compilation settings, source hashes or contents)
  • Source files outlined in the metadata.json.

Say we wanted to verify the contract 0x48A331150C1b442444F4f0371a4daC9Ab2FC837D on Holesky Testnet.

User request to Sourcify:

"address": "0x48A331150C1b442444F4f0371a4daC9Ab2FC837D",
"chainId": "17000", // Ethereum Holesky testnet
"files": {
"metadata.json": "{\"compiler\":{\"version\":\"0.8.7+commit.e28d00a7\"},\"language\":\"Solidity\",\"output\":{\"abi\":[{\"inputs\":[],\"name\":\"retrieve\",\"outputs\":[{\"internalType\":\"uint256\",\"name\":\"\",\"type\":\"uint256\"}],\"stateMutability\":\"view\",\"type\":\"function\"},{\"inputs\":[{\"internalType\":\"uint256\",\"name\":\"num\",\"type\":\"uint256\"}],\"name\":\"store\",\"outputs\":[],\"stateMutability\":\"nonpayable\",\"type\":\"function\"}],\"devdoc\":{\"details\":\"Store & retrieve value in a variable\",\"kind\":\"dev\",\"methods\":{\"retrieve()\":{\"details\":\"Return value \",\"returns\":{\"_0\":\"value of 'number'\"}},\"store(uint256)\":{\"details\":\"Store value in variable\",\"params\":{\"num\":\"value to store\"}}},\"title\":\"Storage\",\"version\":1},\"userdoc\":{\"kind\":\"user\",\"methods\":{},\"version\":1}},\"settings\":{\"compilationTarget\":{\"contracts/1_Storage.sol\":\"Storage\"},\"evmVersion\":\"london\",\"libraries\":{},\"metadata\":{\"bytecodeHash\":\"ipfs\"},\"optimizer\":{\"enabled\":false,\"runs\":200},\"remappings\":[]},\"sources\":{\"contracts/1_Storage.sol\":{\"keccak256\":\"0xb6ee9d528b336942dd70d3b41e2811be10a473776352009fd73f85604f5ed206\",\"license\":\"GPL-3.0\",\"urls\":[\"bzz-raw://fe52c6e3c04ba5d83ede6cc1a43c45fa43caa435b207f64707afb17d3af1bcf1\",\"dweb:/ipfs/QmawU3NM1WNWkBauRudYCiFvuFE1tTLHB98akyBvb9UWwA\"]}},\"version\":1}",
"1_Storage.sol": "// SPDX-License-Identifier: GPL-3.0\n\npragma solidity >=0.7.0 <0.9.0;\n\n/**\n * @title Storage\n * @dev Store & retrieve value in a variable\n */\ncontract Storage {\n\n uint256 number;\n\n /**\n * @dev Store value in variable\n * @param num value to store\n */\n function store(uint256 num) public {\n number = num;\n }\n\n /**\n * @dev Return value \n * @return value of 'number'\n */\n function retrieve() public view returns (uint256){\n return number;\n }\n}"

For now, the metadata.json is the main way to submit contracts to Sourcify. All other methods, such as "Import from Etherscan", are workarounds that generate the metadata file in some way and continue verification from there. This will change with Vyper verification and in our APIv2. For Vyper, users will give us a standard JSON and we will generate a "fake" metadata.json for backward compatibility reasons. In our APIv2 we will no longer use the metadata file as the base of our verification, but the standard JSON.

Example metadata.json file
"compiler": { "version": "0.8.7+commit.e28d00a7" },
"language": "Solidity",
"output": {
"abi": [
"inputs": [],
"name": "retrieve",
"outputs": [{ "internalType": "uint256", "name": "", "type": "uint256" }],
"stateMutability": "view",
"type": "function"
"inputs": [{ "internalType": "uint256", "name": "num", "type": "uint256" }],
"name": "store",
"outputs": [],
"stateMutability": "nonpayable",
"type": "function"
"devdoc": {
"details": "Store & retrieve value in a variable",
"kind": "dev",
"methods": {
"retrieve()": { "details": "Return value ", "returns": { "_0": "value of 'number'" } },
"store(uint256)": { "details": "Store value in variable", "params": { "num": "value to store" } }
"title": "Storage",
"version": 1
"userdoc": { "kind": "user", "methods": {}, "version": 1 }
"settings": {
"compilationTarget": { "contracts/1_Storage.sol": "Storage" },
"evmVersion": "london",
"libraries": {},
"metadata": { "bytecodeHash": "ipfs" },
"optimizer": { "enabled": false, "runs": 200 },
"remappings": []
"sources": {
"contracts/1_Storage.sol": {
"keccak256": "0xb6ee9d528b336942dd70d3b41e2811be10a473776352009fd73f85604f5ed206",
"license": "GPL-3.0",
"urls": [
"version": 1

Anyway, for now we continue with our metadata.json. To be able to compile, we first need to make sure we have everything we need, that is, all the source files and settings outlined in the metadata.json. The JSON file has a sources field that looks like this:

  "sources": {
"contracts/1_Storage.sol": {
"keccak256": "0xb6ee9d528b336942dd70d3b41e2811be10a473776352009fd73f85604f5ed206",
"license": "GPL-3.0",
"urls": [

With the file hashes we can validate the source files the user gave us and make sure we have all the source files needed. Taking the user request, we:

  1. Find the metadata file in the user request .files (assume the rest are source files)
  2. Hash all other files and keep their keccak
  3. Additionally generate new line and whitespace variations of the source files to account for the OS and platform differences.
  4. Try to mark out all entries in sources according to their keccak256
  5. If there are sources missing, try to fetch them from their IPFS hash e.g. dweb:/ipfs/Qmf2J3oWXHBnoNcGXkNYcSirUvJZebXa3j3Cn3vccqS1x7
  6. If we still have missing sources, tell user what's missing
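The matching part of these steps can be sketched roughly as follows. This is a simplified illustration rather than Sourcify's implementation: the function names are made up, the set of variations is deliberately small, and the hash function is injected as a parameter (in practice it would be keccak256, which needs a library).

```typescript
type HashFn = (content: string) => string;

// A few newline and trailing-whitespace variations (illustrative, not exhaustive).
function variations(content: string): string[] {
  const lf = content.replace(/\r\n/g, "\n");
  const crlf = lf.replace(/\n/g, "\r\n");
  const trimmed = lf.replace(/\s+$/, "");
  return Array.from(new Set([content, lf, crlf, trimmed, trimmed + "\n"]));
}

interface MatchResult {
  found: Record<string, string>; // source path -> content whose hash matched
  missing: string[]; // source paths with no matching file
}

// `expected` is the `sources` object of the metadata.json.
function matchSources(
  expected: Record<string, { keccak256: string }>,
  files: string[],
  hash: HashFn
): MatchResult {
  // Index every variation of every provided file by its hash.
  const byHash = new Map<string, string>();
  for (const file of files) {
    for (const variant of variations(file)) {
      byHash.set(hash(variant), variant);
    }
  }
  const found: Record<string, string> = {};
  const missing: string[] = [];
  for (const path of Object.keys(expected)) {
    const content = byHash.get(expected[path].keccak256);
    if (content !== undefined) found[path] = content;
    else missing.push(path);
  }
  return { found, missing };
}
```

Anything left in `missing` is what we would then try to fetch from IPFS, or report back to the user.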

Assuming we found all the sources we go forward with the compilation with the language, compiler.version and the settings field in the metadata.json:

{
  "compiler": {
    "version": "0.8.7+commit.e28d00a7"
  },
  "language": "Solidity",
  "settings": {
    "compilationTarget": {
      "contracts/1_Storage.sol": "Storage"
    },
    "evmVersion": "london",
    "metadata": {
      "bytecodeHash": "ipfs"
    },
    "optimizer": {
      "enabled": false,
      "runs": 200
    }
  }
}

With this information we create a standard JSON input file and feed it to the compiler.
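A minimal sketch of that conversion, under simplifying assumptions: the function and type names are hypothetical, and it ignores details such as the `libraries` field, which is laid out differently in the metadata than in the standard JSON. The key points are that `compilationTarget` is a metadata-only setting and that the compiler needs an `outputSelection` telling it what to emit.

```typescript
// Only the parts of metadata.json this sketch needs.
interface Metadata {
  language: string;
  settings: { compilationTarget: Record<string, string>; [key: string]: unknown };
  sources: Record<string, unknown>;
}

// `contents` maps each source path to the validated file content.
function toStandardJson(metadata: Metadata, contents: Record<string, string>) {
  // Copy the settings and drop the metadata-only field.
  const settings: Record<string, unknown> = { ...metadata.settings };
  delete settings["compilationTarget"];

  const sources: Record<string, { content: string }> = {};
  for (const path of Object.keys(metadata.sources)) {
    if (!(path in contents)) throw new Error(`Missing source: ${path}`);
    sources[path] = { content: contents[path] };
  }

  return {
    language: metadata.language,
    sources,
    settings: {
      ...settings,
      // Ask the compiler for the outputs needed for matching.
      outputSelection: {
        "*": {
          "*": [
            "abi",
            "metadata",
            "evm.bytecode.object",
            "evm.deployedBytecode.object",
            "evm.deployedBytecode.immutableReferences",
          ],
        },
      },
    },
  };
}
```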

The compiler will give us an output that looks like this:

"contracts": {
"contracts/1_Storage.sol": {
"Storage": {
"abi": [{"inputs":[],"name":"retrieve","outputs":[{"internalType":"uint256","name":"","type":"uint256"}],"stateMutability":"view","type":"function"},{"inputs":[{"internalType":"uint256","name":"num","type":"uint256"}],"name":"store","outputs":[],"stateMutability":"nonpayable","type":"function"}],
"devdoc": {"details":"Store & retrieve value in a variable","kind":"dev","methods":{"retrieve()":{"details":"Return value ","returns":{"_0":"value of 'number'"}},"store(uint256)":{"details":"Store value in variable","params":{"num":"value to store"}}},
"evm": {
"bytecode": {
"generatedSources": [],
"linkReferences": {},
"object": "608060405234801561001057600080fd5b50610150806100206000396000f3fe608060405234801561001057600080fd5b50600436106100365760003560e01c80632e64cec11461003b5780636057361d14610059575b600080fd5b610043610075565b60405161005091906100d9565b60405180910390f35b610073600480360381019061006e919061009d565b61007e565b005b60008054905090565b8060008190555050565b60008135905061009781610103565b92915050565b6000602082840312156100b3576100b26100fe565b5b60006100c184828501610088565b91505092915050565b6100d3816100f4565b82525050565b60006020820190506100ee60008301846100ca565b92915050565b6000819050919050565b600080fd5b61010c816100f4565b811461011757600080fd5b5056fea2646970667358221220404e37f487a89a932dca5e77faaf6ca2de3b991f93d230604b1b8daaef64766264736f6c63430008070033",
"sourceMap": "141:356:0:-:0;;;;;;;;;;;;;;;;;;;"
"deployedBytecode": {
"immutableReferences": {},
"linkReferences": {},
"object": "608060405234801561001057600080fd5b50600436106100365760003560e01c80632e64cec11461003b5780636057361d14610059575b600080fd5b610043610075565b60405161005091906100d9565b60405180910390f35b610073600480360381019061006e919061009d565b61007e565b005b60008054905090565b8060008190555050565b60008135905061009781610103565b92915050565b6000602082840312156100b3576100b26100fe565b5b60006100c184828501610088565b91505092915050565b6100d3816100f4565b82525050565b60006020820190506100ee60008301846100ca565b92915050565b6000819050919050565b600080fd5b61010c816100f4565b811461011757600080fd5b5056fea2646970667358221220404e37f487a89a932dca5e77faaf6ca2de3b991f93d230604b1b8daaef64766264736f6c63430008070033",
"sourceMap": "141:356:0:-:0;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;416:79;;;:::i;:::-;;;;;;;:::i;:::-;;;;;;;;271:64;;;;;;;;;;;;;:::i;:::-;;:::i;:::-;;416:79;457:7;482:6;;475:13;;416:79;:::o;271:64::-;325:3;316:6;:12;;;;271:64;:::o;7:139:1:-;53:5;91:6;78:20;69:29;;107:33;134:5;107:33;:::i;:::-;7:139;;;;:::o;152:329::-;211:6;260:2;248:9;239:7;235:23;231:32;228:119;;;266:79;;:::i;:::-;228:119;386:1;411:53;456:7;447:6;436:9;432:22;411:53;:::i;:::-;401:63;;357:117;152:329;;;;:::o;487:118::-;574:24;592:5;574:24;:::i;:::-;569:3;562:37;487:118;;:::o;611:222::-;704:4;742:2;731:9;727:18;719:26;;755:71;823:1;812:9;808:17;799:6;755:71;:::i;:::-;611:222;;;;:::o;920:77::-;957:7;986:5;975:16;;920:77;;;:::o;1126:117::-;1235:1;1232;1225:12;1249:122;1322:24;1340:5;1322:24;:::i;:::-;1315:5;1312:35;1302:63;;1361:1;1358;1351:12;1302:63;1249:122;:::o"
"legacyAssembly": {},
"metadata": "{\"compiler\":{\"version\":\"0.8.7+commit.e28d00a7\"},\"language\":\"Solidity\",\"output\":{\"abi\":[{\"inputs\":[],\"name\":\"retrieve\",\"outputs\":[{\"internalType\":\"uint256\",\"name\":\"\",\"type\":\"uint256\"}],\"stateMutability\":\"view\",\"type\":\"function\"},{\"inputs\":[{\"internalType\":\"uint256\",\"name\":\"num\",\"type\":\"uint256\"}],\"name\":\"store\",\"outputs\":[],\"stateMutability\":\"nonpayable\",\"type\":\"function\"}],\"devdoc\":{\"details\":\"Store & retrieve value in a variable\",\"kind\":\"dev\",\"methods\":{\"retrieve()\":{\"details\":\"Return value \",\"returns\":{\"_0\":\"value of 'number'\"}},\"store(uint256)\":{\"details\":\"Store value in variable\",\"params\":{\"num\":\"value to store\"}}},\"title\":\"Storage\",\"version\":1},\"userdoc\":{\"kind\":\"user\",\"methods\":{},\"version\":1}},\"settings\":{\"compilationTarget\":{\"contracts/1_Storage.sol\":\"Storage\"},\"evmVersion\":\"london\",\"libraries\":{},\"metadata\":{\"bytecodeHash\":\"ipfs\"},\"optimizer\":{\"enabled\":false,\"runs\":200},\"remappings\":[]},\"sources\":{\"contracts/1_Storage.sol\":{\"keccak256\":\"0xb6ee9d528b336942dd70d3b41e2811be10a473776352009fd73f85604f5ed206\",\"license\":\"GPL-3.0\",\"urls\":[\"bzz-raw://fe52c6e3c04ba5d83ede6cc1a43c45fa43caa435b207f64707afb17d3af1bcf1\",\"dweb:/ipfs/QmawU3NM1WNWkBauRudYCiFvuFE1tTLHB98akyBvb9UWwA\"]}},\"version\":1}",
"storageLayout": {
"storage": [
"astId": 4,
"contract": "contracts/1_Storage.sol:Storage",
"label": "number",
"offset": 0,
"slot": "0",
"type": "t_uint256"
"types": { "t_uint256": { "encoding": "inplace", "label": "uint256", "numberOfBytes": "32" } }
"userdoc": { "kind": "user", "methods": {}, "version": 1 }
"sources": { "contracts/1_Storage.sol": { "id": 0 } }

We are only interested in the compilationTarget (e.g. contracts/1_Storage.sol and Storage) and not all other contracts provided. The two fields of our interest are evm.bytecode.object and evm.deployedBytecode.object.

Onchain vs. Recompiled Bytecodes

Essentially we need to compare some onchain data against a compilation. We will find the onchain bytecodes using the chainId + address provided through an RPC. The recompiled bytecodes will come from the compilation output above.


Onchain vs. recompiled refers to where these bytecodes come from. The onchain bytecode, obviously, comes from a contract that is deployed on a live blockchain. The recompiled bytecode comes from a compilation we perform on the source code.

Runtime vs. Creation Bytecodes

We talked about the onchain and recompiled bytecodes. In addition to this, a contract has two types of bytecode that we can perform the verification on.

Runtime bytecode is the bytecode that gets stored at the contract's address and that will be executed when this contract is called.

  • The recompiled runtime bytecode is found under evm.deployedBytecode field of the compiler outputs.
  • The onchain runtime bytecode is quite straightforward to get. It can be obtained with provider.getCode(0xabc..def), essentially with the eth_getCode RPC call to a node running the chain we are interested in with the chainId.

Creation bytecode, sometimes referred to as the "initcode", is the code that will be executed by the EVM to deploy this contract.

  • The recompiled creation bytecode is found under evm.bytecode field of the compiler outputs.
  • The onchain creation bytecode is a little trickier.
    • Typically, when a contract is created by an EOA, the receiver of the transaction is set to zero and the creation bytecode is placed in the transaction payload (tx.input). The EVM interprets this as a contract creation and executes the transaction payload.
    • If a contract is created by another contract (factory pattern), we have to look into the transaction traces, i.e. every single step within the transaction execution, to find the exact place where this contract's creation code was executed. Unfortunately, unlike the runtime bytecode, this data is not easily available from RPCs.
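For the simple cases, the RPC side can be sketched like this. The helper names are hypothetical; the sketch only builds the JSON-RPC request body for eth_getCode and reads the creation bytecode from a transaction object, assuming the node reports contract-creating transactions with `to` set to `null`. The factory case, which needs traces, is not covered.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: unknown[];
}

// Request body for fetching the onchain runtime bytecode of `address`.
function buildGetCodeRequest(address: string, id: number = 1): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "eth_getCode", params: [address, "latest"] };
}

// Minimal view of a transaction as returned by eth_getTransactionByHash.
interface RpcTransaction {
  to: string | null;
  input: string;
}

// Returns the onchain creation bytecode (including any appended constructor
// arguments) if the transaction is a direct contract creation, otherwise null.
function creationBytecodeFromTx(tx: RpcTransaction): string | null {
  return tx.to === null ? tx.input : null;
}
```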

To be able to say "Yes this source code is the source code of this contract 0x48A331150C1b442444F4f0371a4daC9Ab2FC837D on Ethereum Holesky Testnet", we will compare the following and try to get a match:

  • onchain vs. recompiled runtime bytecodes
  • onchain vs. recompiled creation bytecodes

Matching and Transformations

Now we are comparing the bytecodes.

If you're lucky you get a 100% match and you're done. Take this contract:

// SPDX-License-Identifier: GPL-3.0

pragma solidity >=0.8.2 <0.9.0;

contract Storage {

    uint256 number;

    function store(uint256 num) public {
        number = num;
    }

    function retrieve() public view returns (uint256){
        return number;
    }
}

"compiler": {
"version": "0.8.26+commit.8a97fa7a"
"language": "Solidity",
"output": {
"abi": [
"inputs": [

"name": "retrieve",
"outputs": [
"internalType": "uint256",
"name": "",
"type": "uint256"
"stateMutability": "view",
"type": "function"
"inputs": [
"internalType": "uint256",
"name": "num",
"type": "uint256"
"name": "store",
"outputs": [

"stateMutability": "nonpayable",
"type": "function"
"devdoc": {
"kind": "dev",
"methods": {

"version": 1
"userdoc": {
"kind": "user",
"methods": {

"version": 1
"settings": {
"compilationTarget": {
"contracts/1_Storage.sol": "Storage"
"evmVersion": "cancun",
"libraries": {

"metadata": {
"bytecodeHash": "ipfs"
"optimizer": {
"enabled": false,
"runs": 200
"remappings": [

"sources": {
"contracts/1_Storage.sol": {
"keccak256": "0x37ee358e0c9d3c9a75b75a2723ad8ab652c9e93ca38954426a5e9f8b80b83452",
"license": "GPL-3.0",
"urls": [
"version": 1

Deployed at 0x48A331150C1b442444F4f0371a4daC9Ab2FC837D on Holesky Testnet, it has the following bytecodes:

Runtime Bytecodes: Onchain vs. Recompiled:


Creation Bytecodes: Onchain vs. Recompiled:


You'll see both the runtime and creation bytecodes match between the onchain and recompiled ones. So we say that the source-code of the contract 0x48A331150C1b442444F4f0371a4daC9Ab2FC837D on the Holesky Testnet is the one above.

But most of the time this is not as straightforward.

Often, parts of the bytecode are modified when the contract is being deployed. This is what we refer to as "transformations" within Sourcify and the Verifier Alliance. There are certain patterns in Solidity and Vyper contracts that can differ between the recompiled and the onchain bytecodes, but even with different values the contract behaves the same. You can guess that these transformations affect the "data" or "metadata" of the contract embedded in its bytecode, and not the functional bytecode, i.e. the opcodes.

In the Verifier Alliance's json-schemas we define the possible transformations for the creation bytecode and the runtime bytecode as such:

Creation Bytecode Transformations

There are 3 transformations possible as creation_transformations:

  • constructorArguments
  • cborAuxdata
  • library

Constructor Arguments

The constructor arguments of a contract are ABI-encoded and appended at the end of the onchain creation bytecode.

As constructor arguments are only appended to the compiled bytecode, this is an insert (append) type transformation.
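In the simplest case, where no other transformation applies, finding the constructor arguments is just a prefix check. The sketch below uses a made-up function name and assumes both inputs are hex strings; it is an illustration, not Sourcify's matching code.

```typescript
// Returns the hex-encoded constructor arguments appended to the onchain
// creation bytecode, or null if the recompiled bytecode is not a prefix of it.
function extractConstructorArguments(
  onchainCreation: string,
  recompiledCreation: string
): string | null {
  const onchain = onchainCreation.toLowerCase().replace(/^0x/, "");
  const recompiled = recompiledCreation.toLowerCase().replace(/^0x/, "");
  if (!onchain.startsWith(recompiled)) return null;
  // Whatever follows the compiled bytecode is the ABI-encoded arguments.
  return onchain.slice(recompiled.length);
}
```

An empty string means the bytecodes are identical and the contract took no constructor arguments.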


For a contract with constructor arguments

// SPDX-License-Identifier: GPL-3.0

pragma solidity >=0.7.0 <0.9.0;

contract Owner {

    address private owner;

    event OwnerSet(address indexed oldOwner, address indexed newOwner);

    modifier isOwner() {
        require(msg.sender == owner, "Caller is not owner");
        _;
    }

    constructor(address passedOwner) {
        owner = passedOwner; // the owner is passed in as a constructor argument
        emit OwnerSet(address(0), owner);
    }

    function changeOwner(address newOwner) public isOwner {
        require(newOwner != address(0), "New owner should not be the zero address");
        emit OwnerSet(owner, newOwner);
        owner = newOwner;
    }

    function getOwner() external view returns (address) {
        return owner;
    }
}

The recompiled creation bytecode:


The transaction 0x0f2ab4fbc424947d36039086481ffe083d1a710df1c95a6b480ec31cf1919ebf that created the contract has the following payload:


Looking at the end of the contract. Recompiled creation bytecode:


Transaction payload (onchain creation bytecode):


You can see the following bytes are appended (insert), which is the passedOwner constructor argument in the code:


CBOR Auxdata

By default both the Solidity and Vyper compilers add some metadata to contract's bytecode. This is written in CBOR encoding in bytes. For Solidity contracts you can check

This section does not contain any opcodes or executable bytes and is therefore independent from the source code we are trying to verify against. So two contracts with different cborAuxdata sections but with the rest of the bytecodes matching should match. However, Solidity and recently Vyper place an integrity hash within this CBOR-encoded section. If, in addition to the rest of the bytecode, the integrity hashes match too, the compilation is 100% identical to the original one: not even a whitespace or a comment in the source code is different.

However, it is not straightforward to find where these sections are in the bytecode. You can read our blog post for more technical info on how to do this.

To read more about how this helps with verification see our docs

Since we replace a cborAuxdata with another, this is a replace type transformation.
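For the easy case, a Solidity runtime bytecode with a single auxdata at the very end, the section can be located from the last two bytes, which hold the length of the CBOR data. The sketch below relies on that layout and nothing else: it does not validate the CBOR, and it does not handle creation bytecodes or nested contracts, which need the techniques from the blog post mentioned above. The function names are made up.

```typescript
interface SplitBytecode {
  executable: string; // everything before the auxdata
  auxdata: string; // CBOR bytes plus the trailing 2-byte length
}

// Splits a hex bytecode using the 2-byte big-endian CBOR length at its end.
function splitTrailingAuxdata(bytecodeHex: string): SplitBytecode | null {
  const code = bytecodeHex.toLowerCase().replace(/^0x/, "");
  if (code.length < 4) return null;
  const cborLength = parseInt(code.slice(-4), 16); // length in bytes
  const auxdataHexLength = (cborLength + 2) * 2;
  if (isNaN(cborLength) || auxdataHexLength > code.length) return null;
  return {
    executable: code.slice(0, code.length - auxdataHexLength),
    auxdata: code.slice(code.length - auxdataHexLength),
  };
}

// True if two bytecodes are identical once the trailing auxdata is ignored.
function equalIgnoringTrailingAuxdata(a: string, b: string): boolean {
  const splitA = splitTrailingAuxdata(a);
  const splitB = splitTrailingAuxdata(b);
  if (splitA === null || splitB === null) return false;
  return splitA.executable === splitB.executable;
}
```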


Our previous contract 0x48A331150C1b442444F4f0371a4daC9Ab2FC837D on Holesky had the following source code:

// SPDX-License-Identifier: GPL-3.0

pragma solidity >=0.8.2 <0.9.0;

contract Storage {

    uint256 number;

    function store(uint256 num) public {
        number = num;
    }

    function retrieve() public view returns (uint256){
        return number;
    }
}

Now if we add a comment to the source code, the functionality of the code will not change but the cborAuxdata will change because the metadata hash will be different:

contract Storage {
// This is a comment
uint256 number;

In this case the cborAuxdata parts will be different.

The onchain runtime bytecode of 0x48A331150C1b442444F4f0371a4daC9Ab2FC837D


The recompiled runtime bytecode:


You will see that the two cborAuxdata parts are different:



Libraries (in Solidity) are contracts that are deployed once to an address and can be used multiple times. If a contract is using a deployed library, the address of this library is embedded inside bytecode.

The process of passing the deployed library's address to the contract being compiled is called "library linking". Normally, this can be done by passing the libraries field to the compiler or with the --libraries flag. Otherwise the compiler will put placeholders within the bytecode where the deployed library addresses will be placed.

The placeholders look like this:


So if a contract has unlinked libraries after compilation, these placeholders will later be replaced with the deployed library's address. This is also a replace type transformation.
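A small sketch of this replacement, assuming the placeholder format used by Solidity 0.5.0 and later: `__$`, then 34 hex characters derived from the hash of the fully qualified library name, then `$__`, which adds up to the same 40 characters as a hex-encoded address. The helper names are hypothetical.

```typescript
// Matches placeholders such as __$<34 hex characters>$__ in unlinked bytecode.
const LIBRARY_PLACEHOLDER = /__\$[0-9a-f]{34}\$__/g;

// Lists the distinct placeholders found in an unlinked bytecode.
function findLibraryPlaceholders(bytecode: string): string[] {
  return Array.from(new Set(bytecode.match(LIBRARY_PLACEHOLDER) || []));
}

// Replaces every occurrence of one placeholder with the library address.
function linkLibrary(bytecode: string, placeholder: string, address: string): string {
  const addr = address.toLowerCase().replace(/^0x/, "");
  if (addr.length !== 40) throw new Error("Expected a 20-byte address");
  return bytecode.split(placeholder).join(addr);
}
```

For verification we go the other way around as well: the 20 bytes found onchain at a placeholder's position tell us which library address the contract was linked against.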


Take the following contract:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

library Lib {
    function sum(uint256 a, uint256 b) external pure returns (uint256) {
        return a + b;
    }
}

contract A {
    function sum(uint256 a, uint256 b) external pure returns (uint256) {
        return Lib.sum(a, b);
    }
}

Deploying this contract entails two transactions:

  1. Deploy the library Lib in tx 0x5bc1a60a9b2107f78d20384d1b3cf7358bc460b650d55222888ea6252429cdcb at the address 0x99F7C0086ab897C8fE65eBfF268c732a7e15b25F
  2. Deploy the contract A in tx 0xc18e07649930333d7c8786c3e23ad01a6109f3cad516a01adeca90ebf12a886d at the address 0xb98c7040B589F4fEFa8690DC71Ae0FD602661F43.

If we look at the recompiled bytecodes (both runtime and creation) we can see the following section:


and in the onchain runtime and creation bytecodes this will be replaced with:


where the difference 99f7c0086ab897c8fe65ebff268c732a7e15b25f is indeed the address of the library contract.

Keep in mind that the linking was handled by Remix manually and wasn't done through the compiler with the .libraries field. If properly linked via the compiler, there wouldn't be any placeholders. However, Remix handles both steps for us: first deploying the library, and later manually linking the library in the contract's bytecode.

Runtime Bytecode Transformations

We talked about which transformations can be applied to the creation bytecode. The transformations on the runtime bytecode are similar, but with two differences.

First, since constructor arguments can only exist for the creation bytecode, the runtime bytecode does not have a constructorArguments type transformation.

Second, the runtime bytecode can additionally have an immutable transformation.

The library and cborAuxdata transformations for the runtime and creation bytecodes are exactly the same.


Immutables are contract variables that can only be set at deploy time, cannot be changed afterwards, and are embedded within the contract's bytecode itself, in contrast to other variables persisted in the contract storage.

Solidity and Vyper compilers handle immutables differently. For Vyper contracts, the immutable variables are always at the end of the runtime bytecode, so they are appended. Therefore, Vyper contracts will have an insert type transformation. In Solidity, immutables can be anywhere in the bytecode but their positions will be output within the immutableReferences field. Using this compiler output, we can apply the transformation on the recompiled runtime bytecode and see if it matches the onchain runtime bytecode.
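For the Solidity case, applying the transformation could look roughly like the sketch below. It assumes both bytecodes are hex strings without a 0x prefix and that `immutableReferences` maps an AST id to a list of `{ start, length }` byte ranges, as in the compiler output; the function name is made up.

```typescript
interface ImmutableReference {
  start: number; // byte offset in the runtime bytecode
  length: number; // number of bytes
}

// Copies the immutable values found onchain into the recompiled runtime
// bytecode at the positions reported by the compiler.
function applyImmutables(
  recompiledRuntime: string,
  onchainRuntime: string,
  immutableReferences: Record<string, ImmutableReference[]>
): string {
  let result = recompiledRuntime;
  for (const astId of Object.keys(immutableReferences)) {
    for (const ref of immutableReferences[astId]) {
      const from = ref.start * 2; // two hex characters per byte
      const to = (ref.start + ref.length) * 2;
      result = result.slice(0, from) + onchainRuntime.slice(from, to) + result.slice(to);
    }
  }
  return result;
}
```

If the result equals the onchain runtime bytecode, the only differences were the immutable values. A stricter version would also check that all references of one immutable carry the same value.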

Solidity Example
pragma solidity >=0.7.0;

contract WithImmutables {
    uint256 public immutable _a;

    string _name;

    constructor (uint256 a) {
        _a = a;
    }

    function sign(string memory name) public {
        _name = name;
    }

    function read() public view returns(string memory) {
        return _name;
    }
}

Deployed at transaction 0x425712a9d950f7135650fd2652c528112ea7c5c5f669b843c0721a33512b9e7f with the constructor argument of value 255 (0xff in hex)

We can see the immutable both being passed as the constructor argument in the transaction payload (creation bytecode):


and embedded in the runtime bytecodes.

The recompiled runtime bytecode:


The onchain runtime bytecode:

Vyper Example
# pragma version ^0.4.0

OWNER: public(immutable(address))
MY_IMMUTABLE: public(immutable(uint256))

@deploy
def __init__(val: uint256):
    OWNER = msg.sender
    MY_IMMUTABLE = val

@view
@external
def get_my_immutable() -> uint256:
    return MY_IMMUTABLE

Here we have two immutables: one is set to msg.sender and the other is passed as a constructor argument.

The contract is deployed at the transaction 0xd9b3db6bbe693a0aa8feebea814c3653014e19568e30f77722cecfed9cfd9f05.

We can see the MY_IMMUTABLE's passed value 255 (0xff in hex) in our constructor arguments appended and passed in the transaction payload (creation bytecode):


And embedded (appended) to the runtime bytecodes:

Recompiled runtime bytecode (immutables will be missing):


Onchain runtime bytecode:


Both the OWNER and MY_IMMUTABLE are appended to the runtime bytecode.

Call Protection

Since libraries are meant to be called with DELEGATECALL or CALLCODE rather than CALL (except for view and pure functions), the Solidity compiler places a check on who is executing the code at the beginning of the library bytecode. This callProtection starts with 0x73 (PUSH20) followed by the address of the library itself, and the code compares the current address against it.

At deploy time, these 20 bytes are replaced with the contract's own address.

See Solidity docs.
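As a sketch, this transformation is a replace at a fixed position: bytes 1 to 20 of the runtime bytecode. The code below assumes the recompiled library bytecode carries 20 zero bytes after the PUSH20 opcode, and the function name is hypothetical.

```typescript
// PUSH20 followed by a 20-byte zero placeholder, as hex.
const CALL_PROTECTION_PLACEHOLDER = "73" + "00".repeat(20);

// Writes the library's own address into the call protection of a recompiled
// library runtime bytecode. Bytecode without the placeholder is returned as is.
function applyCallProtection(recompiledRuntime: string, libraryAddress: string): string {
  const code = recompiledRuntime.toLowerCase().replace(/^0x/, "");
  const addr = libraryAddress.toLowerCase().replace(/^0x/, "");
  if (addr.length !== 40) throw new Error("Expected a 20-byte address");
  if (!code.startsWith(CALL_PROTECTION_PLACEHOLDER)) return code;
  return "73" + addr + code.slice(CALL_PROTECTION_PLACEHOLDER.length);
}
```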


In our previous library example we can look at the library Lib's (deployed at 0x99F7C0086ab897C8fE65eBfF268c732a7e15b25F) bytecodes. Since the call protection is not present before the deployment, i.e. in the creation bytecode, we'll only look at the runtime bytecodes:

Recompiled runtime bytecode:


Onchain runtime bytecode:


After matching

If we got a match, it means that the provided source code and metadata can be associated with the contract at this chainId and address and we mark this contract as verified. It also means we can start from recompiled bytecodes, apply each runtime or creation transformations on top of the respective recompiled bytecodes, and obtain the onchain bytecodes.
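To illustrate the idea of replaying transformations, here is a simplified sketch. The `Transformation` shape below is hypothetical and much smaller than the Verifier Alliance schema: each entry carries its type, a byte offset, and the hex value to write.

```typescript
interface Transformation {
  type: "insert" | "replace";
  offset: number; // byte offset into the recompiled bytecode
  value: string; // hex value to insert or to write over existing bytes
}

// Applies the transformations in order to a recompiled bytecode (hex, no 0x).
function applyTransformations(recompiled: string, transformations: Transformation[]): string {
  let result = recompiled;
  for (const t of transformations) {
    const at = t.offset * 2; // two hex characters per byte
    if (t.type === "insert") {
      result = result.slice(0, at) + t.value + result.slice(at);
    } else {
      result = result.slice(0, at) + t.value + result.slice(at + t.value.length);
    }
  }
  return result;
}
```

A contract counts as matching when the transformed recompiled bytecode equals the onchain one. In this sketch, replaces should come before inserts, so that offsets still refer to positions in the original recompiled bytecode.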

Finally, we store our contract in our storage backend of choice. Currently this can be a file system or a SQL database in Sourcify's case. By default we do both. See the Contract Repository docs for more information.

EOF and the road ahead

Big changes to how EVM bytecode is structured are coming up with the EVM Object Format (EOF) proposal. Depending on how the compilers implement these changes, the verification process will change, and verifying EOF contracts will differ from verifying pre-EOF contracts.

One nice feature of EOF is that it separates code from data and gives structure to code. This makes our lives as verifiers much easier compared to the completely unstructured legacy EVM code. The legacy EVM example can be seen in our post Finding Auxdatas in the Bytecode. However, as of the current specification, the metadata is still not fully isolated from the contract code, as there is no separate section for that information. Current compiler implementations put the metadata in the data_section of EOF and this unfortunately does affect the contract's code. That's why we are proposing a separate metadata_section in the EVM Object Format that cannot be reached by the EVM, so that any changes to it will not affect the contract's code. If this is implemented, verifiers can completely ignore the metadata parts of a container and won't have to look for workarounds.

You can check our proposal EIP-7834

Finding Auxdatas in the Bytecode

· 8 min read

The problem

Source code verification requires compiling a contract written in a high-level language (e.g. Solidity, Vyper) to the bytecode, and comparing the compiled bytecode with the onchain bytecode. If there’s a match, we can say the given high-level code is the source-code of the contract at the given address.

The runtime bytecode of contracts by default also contains a special field at the end in CBOR encoding (auxdata). This field contains the hash of the contract metadata file (metadata hash), which acts as a fingerprint of the compilation. The metadata file has the compiler settings and source file hashes, so the slightest change in the compiler settings, or even a whitespace in any of the source files, will cause a change in the metadata hash.

For a visual explanation of everything above, check out

Because of its sensitivity, some verifiers leave this field out in verification. In Sourcify’s case, if the recompiled bytecode and the onchain bytecodes match each other exactly (including the auxdata), it’s great. This will give us a “full match”. If not, we need to find the auxdatas and leave them out when comparing to be able to get at least a "partial match".

However this is not always trivial especially in these cases:

  1. The creation bytecode of a contract does not necessarily have the CBOR encoded part at the very end of the bytecode. Although sometimes it’s found there, this field can be anywhere. In fact the only reason the CBOR encoded part is in the creation bytecode is because the runtime bytecode is embedded inside the creation bytecode as a whole.

When executing the creation bytecode, i.e. deploying the contract, the contract’s runtime bytecode needs to be returned. The runtime bytecode is already inside the creation bytecode, so this part is extracted and returned by taking the offset and the length of the related bytecode. This can be anywhere inside the code. (Check this article for a comprehensive deep dive into contract creation)

  2. The runtime bytecode always has the CBOR encoded part at the end of the contract (unless turned off with appendCbor: false). But the bytecode can contain other contract bytecodes nested inside, which can also have their own auxdatas, and these parts need to be ignored for a verification. This is found for example in factory contracts, where a contract creates another contract and the child contract’s code is nested in the factory’s bytecode.

Now for other “special” parts of the bytecode, the compiler outputs the positions such as immutables in immutableReferences. Unfortunately this is not the case for auxdatas and we need to look elsewhere and find workarounds.


If not the exact positions of the auxdatas, the compiler at least outputs the values. Inside the legacyAssembly object of the compiler output we can find the auxdata, which is under the key .auxdata

example legacyAssembly:

".code": [],
".data": {
  "0": {
    ".auxdata": "a26469706673582212203a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e364736f6c63430007000033",
    ".code": [...]
  }
}

At this point, one could think to do a simple string search in the bytecode for the auxdatas found in legacyAssembly, but it would be possible for an attacker to trick the search function and falsely ignore parts of the bytecode that are not supposed to be ignored.

The vulnerability

Imagine we have the auxdata string from the compiler’s legacyAssembly above.


This could be the auxdata of a simple child contract inside the whole bytecode that we know won’t be affected by the changes of our main contract.

For this specific example the attacker could embed these bytes inside the bytecode with code like this in the main contract:

assembly {
    // Split the code from a push opcode:
    // a26469706673582212203a05097003697b26b1da819218bcd957
    // 79 (PUSH26)
    // 598eaa90539e82a59ecbe4c09757e364736f6c63430007000033

    mstore(0x598eaa90539e82a59ecbe4c09757e364736f6c63430007000033, 0xa26469706673582212203a05097003697b26b1da819218bcd957)
    // PUSH26 0xa26469706673582212203a05097003697b26b1da819218bcd957
    // PUSH26 0x598eaa90539e82a59ecbe4c09757e364736f6c63430007000033
}

By chance (really) this auxdata of 53 bytes is split exactly in the middle, but this doesn’t have to be the case. Remember the large middle portion of the CBOR encoding contains the IPFS hash, so one can salt and iterate it.

Imagine the source code of the attacker compiles to the code below. Putting new lines to demonstrate the (alleged) auxdata part:


This is what we get from the source code the attacker gives us to verify. So we go: “Oh right there's an auxdata a26469706673582212203a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e364736f6c63430007000033 in this bytecode. We should ignore the corresponding part in the (onchain) bytecode to have a partial match.”

Oops now we are ignoring a part in the bytecode that we're not supposed to. These code parts are only meant for non-executable code whereas we embedded this with an assembly block.

In the attacker’s onchain bytecode (what actually will be executed vs. the verified code) the attacker could have placed anything in this assembly block for 53 bytes. I leave it up to your imagination what can be done with this ignored bytecode block.

The gist is, we need to make sure these to-be-ignored blocks are actually auxdatas and not coming from an executable code block. How do we do it?

The solution(s)

Well, we know that the IPFS hash inside the auxdata is the hash of the metadata file and the metadata file contains the source file hashes. So we can touch all source files to change their hashes, e.g. by adding a whitespace at the end of each. By touching every single source file, we make sure the nested auxdatas will be modified as well. If we compile again, we will have the exact same bytecode just with differences at the metadata hashes. Then we can locate the metadata hashes by comparing the original and edited bytecodes side by side.
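The side-by-side comparison can be sketched roughly like this. It is an illustration and not Sourcify's actual code: findDiffRanges and the maxGap parameter are names I made up. Two different hashes will share a few hex characters by pure chance, so differences that sit close to each other are merged into one range.

```typescript
interface DiffRange {
  start: number; // index in the hex string, inclusive
  end: number; // index in the hex string, exclusive
}

// Returns the ranges where two equally long bytecode hex strings differ.
// Differences separated by at most `maxGap` identical characters are merged,
// because two distinct hashes still share some characters by coincidence.
function findDiffRanges(original: string, edited: string, maxGap = 8): DiffRange[] {
  if (original.length !== edited.length) {
    throw new Error("bytecodes must have the same length");
  }
  const ranges: DiffRange[] = [];
  for (let i = 0; i < original.length; i++) {
    if (original[i] === edited[i]) continue;
    const last = ranges[ranges.length - 1];
    if (last !== undefined && i - last.end <= maxGap) {
      last.end = i + 1; // extend the current range
    } else {
      ranges.push({ start: i, end: i + 1 });
    }
  }
  return ranges;
}

// The example auxdata from above, before and after whitespacing the sources.
const cborPrefix = "a2646970667358221220";
const cborSuffix = "64736f6c63430007000033";
const originalBytecode =
  "6080" + cborPrefix + "3a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e3" + cborSuffix + "5256";
const editedBytecode =
  "6080" + cborPrefix + "dceca8706b29e917dacf25fceef95acac8d90d765ac926663ce4096195952b61" + cborSuffix + "5256";

// One range that covers the 64 hex characters of the metadata hash.
console.log(findDiffRanges(originalBytecode, editedBytecode));
```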

But we need one more thing: Now we know where the metadata hashes are but that is just a substring of the whole CBOR auxdata. So we need to figure out where the CBOR auxdata starts and ends.

Blockscout solution

One way to do this is to start at the metadata hash positions we've found by comparing, extend the substring byte by byte, and each time try to decode the whole byte string as CBOR. If the decoding succeeds at some point, we know that the auxdata ends there. Remember that right after the CBOR encoding you'll find the length of the encoded part, so we know where it starts as well.

Indeed this is how Blockscout finds the auxdata positions.
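Below is a rough sketch in the spirit of this idea, and explicitly not Blockscout's actual implementation. As a variation, it walks backwards from the hash position looking for a CBOR map that decodes cleanly and whose byte length equals the two-byte length stored right after it. Only the small CBOR subset that appears in auxdatas is handled, and skipCborItem and findAuxdataAround are hypothetical names.

```typescript
// Converts a hex string (no 0x prefix) into a byte array.
function hexToBytes(hex: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < hex.length; i += 2) {
    bytes.push(parseInt(hex.substring(i, i + 2), 16));
  }
  return bytes;
}

// Returns the index just past the CBOR item starting at `i`, or -1 if it does not parse.
function skipCborItem(b: number[], i: number): number {
  if (i >= b.length) return -1;
  const major = b[i] >> 5;
  const info = b[i] & 0x1f;
  i += 1;
  let arg = info;
  if (info === 24) {
    if (i + 1 > b.length) return -1;
    arg = b[i];
    i += 1;
  } else if (info === 25) {
    if (i + 2 > b.length) return -1;
    arg = b[i] * 256 + b[i + 1];
    i += 2;
  } else if (info > 25) {
    return -1; // longer or indefinite lengths are not needed here
  }
  if (major === 0 || major === 1) return i; // integers
  if (major === 2 || major === 3) return i + arg <= b.length ? i + arg : -1; // byte and text strings
  if (major === 4 || major === 5) {
    const items = major === 5 ? arg * 2 : arg; // a map has key/value pairs
    for (let k = 0; k < items; k++) {
      i = skipCborItem(b, i);
      if (i === -1) return -1;
    }
    return i;
  }
  if (major === 7) return info < 24 ? i : -1; // simple values such as true/false
  return -1; // tags are not expected
}

// `hashByteIndex` is the byte position of the metadata hash found by diffing.
// Returns the byte range of the auxdata including its 2-byte length, or null.
function findAuxdataAround(b: number[], hashByteIndex: number): { start: number; end: number } | null {
  for (let start = hashByteIndex; start >= 0; start--) {
    if (b[start] >> 5 !== 5) continue; // the auxdata is a CBOR map
    const end = skipCborItem(b, start);
    if (end === -1 || end <= hashByteIndex || end + 2 > b.length) continue;
    const declaredLength = b[end] * 256 + b[end + 1];
    if (declaredLength === end - start) return { start, end: end + 2 };
  }
  return null;
}

const bytecode = hexToBytes(
  "6080" +
    "a2646970667358221220" +
    "3a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e3" +
    "64736f6c63430007000033" +
    "5256"
);
// The hash starts at byte 12: 2 bytes of code plus 10 bytes of CBOR prefix.
console.log(findAuxdataAround(bytecode, 12));
```

A check like this can still produce false positives on adversarial input, which is why the positions should come from the diffing step and not from a plain search.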

Sourcify solution

The way we approach this in Sourcify is by again making use of the legacyAssembly.

These are roughly the steps:

  1. Use bytecodes: Compare the original bytecode to the whitespaced (edited) contract’s bytecode. This gives us the positions of the metadata hashes; remember, not of the whole auxdatas.
  2. Use legacyAssembly: Compare the auxdatas from the legacyAssembly objects of both contracts. We get an auxdataDiff for each pair of auxdatas (1st auxdata in original vs 1st in edited etc.). The diff will not be exactly the whole metadata hash: CIDv0 IPFS hashes always start with Qm, so that prefix stays the same and only the rest of the hash differs. The other parts of the auxdatas will be the same. We also keep the position of the diff inside the whole auxdata as diffStart:
    interface AuxdataDiff {
      real: string;
      diffStart: number;
      diff: string;
    }
  3. Remember the positions from step 1 are those of the metadata hashes. If the diff found in the raw bytecode equals the diff from legacyAssembly, we can now find where the whole auxdata starts:
    for (const position of positions) {
      for (const auxdataDiff of auxdataDiffs) {
        // Check whether the diff in the raw bytecode equals the diff from the `legacyAssembly` auxdatas
        if (editedBytecode.substring(position, position + auxdataDiff.diff.length) === auxdataDiff.diff) {
          return originalBytecode.substring(position - auxdataDiff.diffStart, position + auxdataDiff.diff.length);
        }
      }
    }


0x6080... CBOR auxdata 1909117905579a26469706673582212203a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e364736f6c6343000700003352565b5f6a636f6e736f6c652e6c6


0x6080... CBOR auxdata 1909117905579a2646970667358221220dceca8706b29e917dacf25fceef95acac8d90d765ac926663ce4096195952b6164736f6c6343000700003352565b5f6a636f6e736f6c652e6c6

└──────────────────┘↑ diffStart position
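Here is how the auxdataDiff idea could look as a standalone sketch. It reuses the AuxdataDiff shape from above, but the helper names getAuxdataDiff and locateAuxdata are hypothetical, and I am assuming that real holds the original auxdata while diff holds the differing part of the edited one. Note that if the two hashes happen to share their first or last characters, the diff is slightly shorter than the hash.

```typescript
interface AuxdataDiff {
  real: string; // assumption: the auxdata of the original contract
  diffStart: number; // index of the first differing character inside the auxdata
  diff: string; // assumption: the differing part, taken from the edited auxdata
}

// Compares the n-th auxdata of the original and the edited compilation.
function getAuxdataDiff(original: string, edited: string): AuxdataDiff | null {
  if (original.length !== edited.length) return null;
  let first = -1;
  let last = -1;
  for (let i = 0; i < original.length; i++) {
    if (original[i] !== edited[i]) {
      if (first === -1) first = i;
      last = i;
    }
  }
  if (first === -1) return null; // identical, so nothing to locate
  return { real: original, diffStart: first, diff: edited.substring(first, last + 1) };
}

// `diffPosition` is where the two raw bytecodes start to differ (step 1).
function locateAuxdata(
  originalBytecode: string,
  editedBytecode: string,
  diffPosition: number,
  d: AuxdataDiff
): { start: number; end: number } | null {
  const start = diffPosition - d.diffStart;
  if (start < 0) return null;
  const diffMatches =
    editedBytecode.substring(diffPosition, diffPosition + d.diff.length) === d.diff;
  const auxdataMatches = originalBytecode.substring(start, start + d.real.length) === d.real;
  return diffMatches && auxdataMatches ? { start, end: start + d.real.length } : null;
}

const cborPrefix = "a2646970667358221220";
const cborSuffix = "64736f6c63430007000033";
const auxOriginal =
  cborPrefix + "3a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e3" + cborSuffix;
const auxEdited =
  cborPrefix + "dceca8706b29e917dacf25fceef95acac8d90d765ac926663ce4096195952b61" + cborSuffix;

const auxdataDiff = getAuxdataDiff(auxOriginal, auxEdited);
if (auxdataDiff !== null) {
  // In this example the bytecodes start to differ at hex index 24.
  console.log(
    locateAuxdata("6080" + auxOriginal + "5256", "6080" + auxEdited + "5256", 24, auxdataDiff)
  );
}
```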

An Alternative

  1. Start with a string search inside the bytecode for the auxdatas from legacyAssembly of the contract. Now we have the positions of potential auxdatas of the original contract.
  2. Next we whitespace the source files and compile the contract again. Let’s call it the edited contract.
  3. Finally we check if the bytecode substrings from the original contract and the edited contract have changed at the positions we found at the 1st step. We expect these to change if they indeed contain a real auxdata and not some custom bytecode.

Thanks to Rim from Blockscout for pointing out this alternative.
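A minimal sketch of this alternative, under the assumption that both bytecodes are hex strings of the same length. confirmAuxdataPositions is a name I made up. The example replays the attack from above: the bytes hidden behind the PUSH26 opcodes stay identical after whitespacing, so they are rejected, while the real auxdata at the end changes and is confirmed.

```typescript
// Step 1: string search for each auxdata in the original bytecode.
// Step 3: keep only the positions whose content changed in the edited bytecode.
function confirmAuxdataPositions(
  originalBytecode: string,
  editedBytecode: string,
  auxdatas: string[]
): number[] {
  const confirmed: number[] = [];
  for (const auxdata of auxdatas) {
    let from = 0;
    while (true) {
      const position = originalBytecode.indexOf(auxdata, from);
      if (position === -1) break;
      const editedSlice = editedBytecode.substring(position, position + auxdata.length);
      // A real auxdata changes when the sources are whitespaced; hardcoded bytes do not.
      if (editedSlice !== auxdata) confirmed.push(position);
      from = position + 1;
    }
  }
  return confirmed;
}

const cborPrefix = "a2646970667358221220";
const cborSuffix = "64736f6c63430007000033";
const auxOriginal =
  cborPrefix + "3a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e3" + cborSuffix;
const auxEdited =
  cborPrefix + "dceca8706b29e917dacf25fceef95acac8d90d765ac926663ce4096195952b61" + cborSuffix;

// "79" + auxOriginal + "52" mimics the two PUSH26 opcodes and the MSTORE of the attack.
const executablePart = "6080" + "79" + auxOriginal + "52";
const original = executablePart + auxOriginal;
const edited = executablePart + auxEdited;

console.log(confirmAuxdataPositions(original, edited, [auxOriginal])); // only the trailing auxdata
```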

Making life easier for verifiers

To avoid all these nitty-gritty workarounds, we have proposed that the Solidity compiler output the positions of the auxdatas, similar to the immutableReferences field:

We are still going to need to do this for the compiler versions before this gets implemented but still it would be less work in verification, particularly not having to compile contracts twice.

Since we edited the original source code with whitespaces and compiled the contract, we also have the legacyAssembly for the edited contract, which contains its auxdatas. If we compare all the auxdatas extracted from the legacyAssembly objects of both, we will get a diff of each auxdata field, which will be the metadata hashes. The rest of the auxdatas will be the same.

We Need to Talk About the On-Chain Metadata Hash

· 8 min read


The Solidity compiler has a feature, not known by everyone, that appends the IPFS hash of the contract metadata to the contract bytecode. This hash effectively acts as a fingerprint of the compilation, and when deployed, goes onchain. With that, we can verify the contracts "perfectly" and fetch the contract source code from IPFS. One of our missions at Sourcify is to make this feature more known and used, but not everyone is a fan of it.

(If you don't fully understand the metadata hash check out our playground to see it in action.)

I argue this is the only foolproof way to verify contracts. Languages and tooling should come together and come up with a common standard. We should look back at what worked and what didn't, and come up with a better next version.

Runtime code vs Creation code

In source-code verification you compare a bytecode to a high-level code (Solidity, Vyper).

When you compile a contract you get two bytecodes:

Runtime bytecode is the code of the contract living on the blockchain. This is what really gets executed when you call a contract. You'll find it if you look at the bytecode of an unverified contract in a block explorer or when you call eth_getCode(address) on the contract.

Creation bytecode is the code that will be executed by the EVM when the contract is being deployed, which will store the runtime code at the contract's address.

Since the terms are not well defined, some terminology:

  • "code" = "bytecode" in this context. Sometimes people just call it "runtime code", or "creation code".
  • "Init code" = "Creation bytecode". This is usually used in create2 context.
  • "Deployed Bytecode" = "Runtime Bytecode". This is another common way to refer to the runtime bytecode by the Solidity compiler and frameworks. I refrain from using this as sometimes the contract is not deployed and "runtime code" is more accurate.
  • evm.bytecode = "Creation bytecode". The Solidity compiler refers to it as this in the output.
  • evm.deployedBytecode = "Runtime bytecode". Same as above.

Which bytecode?

Let's go back to source code verification. The problem we are trying to solve is that we have a contract and we want to see its original source code, because we humans can't really read bytecode.

However, a contract has two bytecodes, which one should we compare the source code to?

Verifying with Creation Bytecode

One can say that the bytecode counterpart of a contract written in a high-level language is the creation bytecode, because in a typical contract deployment this is what you give to the EVM to execute.

The problem with the creation bytecode is that it's not always stored onchain. The only time you see it is when you deploy a contract from an Externally Owned Account (EOA) by putting the creation bytecode in the transaction's data field and setting the receiver to null. In that case you'll see the creation bytecode if you look at the transaction.

However, for contracts created by other contracts (e.g. factories) it is executed once and then discarded. So someone needs to index and save the creation bytecodes somewhere and you need to trust them. Whereas the runtime bytecode is stored onchain and you can request it from your node with eth_getCode.

On the other hand, the creation bytecode of a contract is not necessarily what the compiler outputs. The creation bytecode can be any code that will execute and store the runtime bytecode at the contract address. See @ricmoo's CREATE2 example. He demonstrates how to deploy and SELFDESTRUCT a contract, and finally deploy a completely different contract at the same address, even though CREATE2 addresses depend on the init code. In this case the init code is the same but it dynamically gets and writes the contract code from somewhere else. If you change the code where it's dynamically fetched from, you deploy a different contract at the same address. So for this contract, even if we knew its original source code, we can't compile and compare against its creation code.

Verifying with the Runtime Bytecode

The runtime bytecode is the actual code of the contract and is readily available at eth_getCode. The compiler also outputs the runtime bytecode so one can verify contracts with the runtime bytecode too. With that, you can easily verify a contract on the "edge" (i.e. on your machine) trustlessly by getting the bytecode from your execution client.

The compiler output can be different than the onchain one as during deployment the runtime bytecode can be modified by writing the immutable values and the linked libraries in the placeholders. It's ok because, for Solidity, the compiler outputs the immutableReferences and libraries have a __$ placeholder, so we know where these are positioned in the bytecode.
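As a hedged sketch of how the immutables can be handled: the compiler's runtime bytecode has zeros where the immutable values go, and immutableReferences lists those byte ranges, so one option is to zero out the same ranges in the onchain bytecode before comparing. zeroOutImmutables is a hypothetical helper, the bytecodes are toy values, and library placeholders are left out.

```typescript
// Shape of solc's evm.deployedBytecode.immutableReferences:
// AST id of the immutable variable -> byte ranges inside the runtime bytecode.
type ImmutableReferences = Record<string, { start: number; length: number }[]>;

// Replaces every immutable range in a runtime bytecode (hex, no 0x prefix) with zeros.
function zeroOutImmutables(runtimeHex: string, refs: ImmutableReferences): string {
  let result = runtimeHex;
  for (const astId of Object.keys(refs)) {
    for (const ref of refs[astId]) {
      const from = ref.start * 2; // byte offset -> hex offset
      const to = from + ref.length * 2;
      const zeros = new Array(to - from + 1).join("0");
      result = result.substring(0, from) + zeros + result.substring(to);
    }
  }
  return result;
}

// Toy example: one 32-byte immutable that starts at byte 2.
const refs: ImmutableReferences = { "7": [{ start: 2, length: 32 }] };
const compiledRuntime = "6080" + new Array(65).join("0") + "5b00"; // zeros as placeholder
const onchainRuntime = "6080" + new Array(63).join("0") + "2a" + "5b00"; // value 42 written at deployment

console.log(compiledRuntime === onchainRuntime); // false
console.log(zeroOutImmutables(onchainRuntime, refs) === compiledRuntime); // true
```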

The problem is, not everything in high-level contract code is represented in the runtime bytecode. Imagine this contract excerpt:

    constructor() {
        owner = msg.sender;
        emit OwnerSet(address(0), owner);
    }

I can deploy this contract but verify it with a slightly different contract with the following constructor, which can have huge implications:

    constructor() {
        owner = tx.origin;
        emit OwnerSet(address(0), owner);
    }

This is because this constructor code part will not be included in the runtime bytecode, and the owner value is not stored inside the bytecode but in the contract's storage.

Verifying with the Runtime Bytecode + Metadata Hash

There's a way around this problem. If you verify a contract with its metadata hash appended to the runtime bytecode, you'll get a full match. This means the source code you are looking at is exactly the same as the one that was originally compiled, because if you change anything about the contract (even a whitespace), the metadata hash will change and you will not get a "full match" but a "partial match".

This, I'd argue, is the only foolproof way to verify a contract's source code. This method covers all the cases above and the ones I haven't mentioned or we don't know about yet. By being based on the runtime code, this also removes the need to trust a third party to index the creation bytecode, and instead you can get the bytecode from your own execution client's JSON RPC interface.

Problems with the Metadata Hash

The main criticism of this feature is that the hash is too sensitive. It's both a bug and a feature that the hash changes even with a whitespace change.

A bigger problem is with the paths of the .sources.

"sources": {
  "myDirectory/myFile.sol": {
    "keccak256": "0x123...",
    "license": "MIT",
    "urls": [ "bzz-raw://7d7a...", "dweb:/ipfs/QmN..." ]
  }
}

The keys here are actually not file paths but source-unit names, meaning they can be arbitrary strings. This is especially a problem for projects deploying with CREATE2, where the address of the contract depends on the init code. Any difference in "path" will be a different metadata hash --> different bytecode --> different contract address. As a result, most of them just turn off this feature.

It's a bigger problem if the same codebase does not compile to the same bytecode on different platforms. The differences caused by comments/whitespaces are not that big of a deal if we can verify contracts at the deployment pipeline i.e. right at the point when they are deployed. This also means we need to stop flattening contracts. Ideally you never drag and drop any files to a website, but use a verification plugin on your tooling (Foundry, Hardhat) or IDE (Remix). No medium size contract would manually be verified.

What would be a more clever way to do this? If we are able to get this right, we solve most of the problems.


The two bytecodes associated with a contract are not always sufficient to correctly verify a contract. The only foolproof and decentralized way to do it is to use the runtime bytecode with the metadata hash appended to it. I believe this needs to be the default way to verify contracts, and only when you can't do it (like this bug), you should fall back to the partial match. Although at Sourcify we base our verification on this, most of the ecosystem doesn't make the partial vs full match distinction or just isn't aware of it.

As an outcome of this article I'd really want to see:

  1. Other cases where a runtime bytecode or creation bytecode fails to correctly verify a contract.
  2. Counter-arguments to the usefulness of the metadata hash.
  3. Clever ways to mitigate the problems with the metadata hash.
  4. Languages other than Solidity adopting this feature, and coming up with a standard for it.

Do you have anything to add to the points above? Please reach out to me on Twitter or add your remarks in the discussion issue for this article (I'll link). I'll also be updating this article with the feedback I get, and be linking to discussions. This will be a living document.

Human-Readable Transactions Working Group

· 4 min read

Human-readability of Ethereum Transactions is a multi-faceted and complex problem that requires ecosystem-wide collaboration. Therefore, it makes sense to create a working group to gather people, projects, and knowledge.


It is a well-known UX problem in Ethereum that users usually don't/can't verify the action they are about to take, because they are not presented with human-readable information. This has led to social engineering hacks where victims lost millions. In one case, a hacker was able to replace the browser wallet, which made the victim sign a transfer transaction on his HW wallet that sends all the tokens to the hacker. In another, the hacker created an offline signature for the victim to list all his NFTs for free.

As a basic example, our goal is to show something similar to the one on the right rather than on the left.

Bytecode vs Human-Readable Tx

Nowadays, many wallets can do the basic ABI decoding and show a verified contract link but users still lack a description of the action they are about to take and additional safety information about the contract they are going to interact with.

How we achieve this at Sourcify is through the NatSpec documentation. If you document your code using NatSpec's @notice and @dev fields and fully verify your contract on Sourcify, the wallet can show the users the description you wrote when calling the function. (details in this talk at Devcon VI or this lightning talk).
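As a small illustration of where that description lives: the @notice texts end up under output.userdoc.methods in the metadata.json, keyed by function signature. The sketch below only does this lookup. The token contract and its notice are made up, and matching the calldata selector to a signature (which needs the ABI) is left out.

```typescript
// Only the parts of metadata.json that this sketch needs.
interface ContractMetadata {
  output: {
    userdoc: {
      methods: Record<string, { notice?: string }>;
    };
  };
}

// Returns the @notice text for a function signature, if the developer wrote one.
function getNotice(metadata: ContractMetadata, signature: string): string | undefined {
  const method = metadata.output.userdoc.methods[signature];
  return method ? method.notice : undefined;
}

// Made-up metadata excerpt for illustration.
const exampleMetadata: ContractMetadata = {
  output: {
    userdoc: {
      methods: {
        "transfer(address,uint256)": {
          notice: "Sends tokens from your account to the recipient.",
        },
      },
    },
  },
};

console.log(getNotice(exampleMetadata, "transfer(address,uint256)"));
console.log(getNotice(exampleMetadata, "approve(address,uint256)")); // undefined: nothing documented
```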

Over time it became clear to me that even if we convince the majority of developers to document using NatSpec and fully verify on Sourcify, this single route won't solve this wicked problem of Human-readable Transactions. The problem is multi-faceted and requires different approaches for different cases. For instance, you can't add NatSpec docs to an already deployed contract, or you can't use Dynamic Expressions for a commit-reveal transaction (e.g. ENS commit).

Actually, there are different approaches, some of which we gathered in the Sourcify docs. Unfortunately, most of them seem to be stale.

Another motivation for us has been the lack of knowledge of what's going on in the space. Even though we were working on this problem, we haven't been aware of the following for a long time:

Solving this problem of transaction human-readability is hard and requires ecosystem-wide collaboration.

For this reason, it makes sense to form a "Human-Readable Transactions Working Group" focused on this specific problem with different interested parties.


How do we define the scope?

Our starting point is the human-readability of the transactions but this really cannot be separated from the safety, UX and human-friendliness. Depending on the progress, other UX and safety aspects are expected to be included in the general work (audits, token registries etc.). Initially, it's called “human-readable tx's WG”, but we'll see where it goes.

The work will mostly be on EVM, but not specific to the Ethereum network.


  • 🎯 Being the Schelling Point: Gather different parties working on the transaction readability, security, and UX in the same place. Enable collaboration between parties, and make sure everyone knows who's working on what.
  • 📚 Being the knowledge base: Discuss and compile the different approaches to the problem. Lay out the advantages and disadvantages of different methods. Document them for the public.
  • 🌟 Open-source the solutions to solve it once and for all.

The goal, however, is not to work on a single agreed solution to the problem. As said, there is no single solution to this problem due to its complexity and context dependence. Likely, there will be conflicts and forks, and each team will focus on what they think is the best way. Ideation and active feedback should allow us to reach the best solutions faster.


This is also a TBD but one potential place for this WG is CASA.


Are you working on similar problems and want to collaborate? Reach out to me on Twitter @kaanuzdogan, Matrix, or Telegram (@kuzdogan)!

Sourcify v2

· 3 min read

Today we released Sourcify v2 🎉

The changes do not break backwards compatibility of the Sourcify Server API. If you are using the Sourcify API you don't need to worry. However, there are some non-breaking additions detailed below.

Why is this a major update then?


The motivation for these changes is to make Sourcify verification more reusable. The lib-sourcify package can be imported into other projects and verify a contract given the source files, and chain&address. Another goal was to create modularity in the codebase with more separated concerns. With these changes, Sourcify server consumes the core lib-sourcify functionality, and takes care of the rest: providing an API, validating inputs, and storing the results (in the repo) etc.

This is in line with what we want to achieve with edge verification. We believe a contract verification should be easily reproducible and you should be able to verify contracts locally without relying on a third party.

Imagine you're interacting with a contract on your wallet. Before you sign a transaction your wallet:

  • fetches the contract's source code from IPFS
  • compiles and verifies with lib-sourcify

without even talking to Sourcify or any other verifier, everything happens on your local machine. Similarly a block explorer like Otterscan can give its users the option to either fetch the verified source code directly from a verifier (like Sourcify), or verify the contract locally on the frontend.

However, the library as it is now is not compatible with browsers yet, and we are working on it. If you are knowledgeable on this front and want to help us, please reach out to us.


The brand new @ethereum-sourcify/lib-sourcify is the library that will do all the weightlifting of assembling a contract (e.g. source files) into a compilable CheckedContract, compiling, and verifying it. You can pass checkFiles your contract source code and metadata.json to pack compilable CheckedContracts.

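const pathBuffers: PathBuffer[] = [];
// `filePaths` is a placeholder name for your list of source and metadata file paths
for (const filePath of filePaths) {
  pathBuffers.push({
    path: filePath,
    buffer: fs.readFileSync(filePath),
  });
}
const checkedContracts: CheckedContract[] = await checkFiles(pathBuffers);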

Then you can verify this CheckedContract against a contract that is deployed on a chain at an address.

const goerliChain = {
  name: "Goerli",
  rpc: [...],
  chainId: 5,
};

const match = await verifyDeployed(

console.log(match.status) // 'perfect'

Creator Tx Hash

We can also verify contracts by looking at the tx.input of the transaction that created the contract. If this matches the creation bytecode of the compiled contract AND the address resulting from the tx.from and tx.nonce matches the given address, we can verify the contract.

const match = await verifyDeployed(
"0xe75fb554e433e03763a1560646ee22dcb74e5274b34c5ad644e7c0f619a7e1d0" //tx hash

(In the server API, find the field creatorTxHash)


You can also verify CREATE2 created contracts:

const match = await verifyCreate2(

console.log(match.chainId); // '0'. create2 matches return 0 as chainId
console.log(match.status); // 'perfect'

Questions? Feedback?

As usual, feel free to reach out to us on Twitter, Matrix chat, or Discord.

✅ Happy verifying!

Verify Contracts Perrrrrfectly: Why and How?

· 7 min read

In an ecosystem with the core values of transparency, security, and trust (and trustlessness), all contract developers are expected to publish their source code. If you're even slightly familiar with Ethereum, there is no need for further explanation.

But if I give you a source code, how do you make sure the published source code really is the source code of the contract? That's where source code verification comes into play.


Throughout this article and 99% of the time in Sourcify context, by verification we will be referring to smart contract verification. Verification sometimes also refers to formal verification.

What is source code verification?

First things first, all the smart contracts on the blockchain are stored as bytecode. Just like our physical machines that only speak bits and bytes, the Ethereum Virtual Machine also only understands bytes. If you ask the Ethereum blockchain for the code of a contract, you only get a byte string.

So, let's say I give you a contract in Solidity and claim that this is the code behind the contract at "0xabcdef...". To verify, you need to make sure this code compiles to the same bytecode as the claimed contract at "0xabcdef...". This is the basic idea behind the smart contract verification: we compile a contract and check if the bytecode matches the one on blockchain.

Visualization of the compilation of a contract Checking if bytecodes match

You have probably made use of contract verification before. For many users this is the green checkmark in Etherscan:

Green checkmark in a verified contract page on Etherscan

You see the green checkmark and you are happy!

But is it really exactly the same code that is deployed?

The answer is, you don't know 🤷

In fact, no one else would be able to know except the contract developer, and he/she can't really prove it. The reason is, when compiling the contract i.e. translating the human-readable source code (in Solidity or any other higher-level language) to machine-readable bytecode, some information is lost. These include internal variable names, internal function names, names of contracts etc.

So yes this is functionally the same code as deployed: it compiles to the same bytecode as the original mysterious source code 🕵.

And you might be thinking, sure this is good enough. But:

  • Someone can insert misleading comments, (internal) function or variable names
  • Whoever verifies a contract first is chosen as the matching result, not the "authentic" one
  • We can't verify things other than the contract's code itself (i.e. metadata)

In fact when not verified properly, it is possible to inject code that would be shown in the verified source code.

Enough bad news... There's actually a way to verify Solidity contracts that would cryptographically ensure the exactness of the source files and it is already here: It's called Sourcify!

This way of verifying contracts is what we call a perfect verification (in contrast to partial verification). It is enabled by the Solidity contract metadata and the fact that its hash is appended to the contract's bytecode. The metadata hash acts as a fingerprint of the whole compilation, and with the information in the metadata file we can completely reproduce the contract compilation.

Contract Metadata

The Solidity compiler by default appends some information to the contract's bytecode in CBOR encoding. This special field, which I like referring to as auxdata, usually contains the "Solidity version", the "metadata hash", and occasionally the "experimental" flag. The encoded data and its decoding look like this:

Decoding of the auxdata appended to the bytecode

You can actually inspect this field and see the decoding in action for any contract in
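If you want to try the decoding by hand, here is a minimal sketch. It is not a full CBOR decoder: it only handles what a typical auxdata contains (a small map with text keys and byte-string or boolean values), and decodeAuxdata is a name I made up. The input is the auxdata of a contract compiled with Solidity 0.7.0, including the trailing two-byte length.

```typescript
// Converts a hex string (no 0x prefix) into a byte array.
function hexToBytes(hex: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < hex.length; i += 2) {
    bytes.push(parseInt(hex.substring(i, i + 2), 16));
  }
  return bytes;
}

// Decodes the CBOR map of an auxdata. Byte strings are returned as hex strings.
function decodeAuxdata(auxdataHex: string): Record<string, string | boolean> {
  const bytes = hexToBytes(auxdataHex);
  let i = 0;
  // Reads one CBOR header: major type plus a short argument (length or simple value).
  const readHeader = (): { major: number; arg: number } => {
    const major = bytes[i] >> 5;
    let arg = bytes[i] & 0x1f;
    i += 1;
    if (arg === 24 && major !== 7) {
      arg = bytes[i]; // the length is in the next byte
      i += 1;
    } else if (arg >= 24) {
      throw new Error("unsupported CBOR header");
    }
    return { major, arg };
  };
  const map = readHeader();
  if (map.major !== 5) throw new Error("auxdata is expected to be a CBOR map");
  const decoded: Record<string, string | boolean> = {};
  for (let n = 0; n < map.arg; n++) {
    const key = readHeader();
    if (key.major !== 3) throw new Error("expected a text key");
    let name = "";
    for (let k = 0; k < key.arg; k++) name += String.fromCharCode(bytes[i + k]);
    i += key.arg;
    const value = readHeader();
    if (value.major === 2) {
      decoded[name] = auxdataHex.substring(i * 2, (i + value.arg) * 2); // byte string
      i += value.arg;
    } else if (value.major === 7) {
      decoded[name] = value.arg === 21; // 0xf5 is true, 0xf4 is false
    } else {
      throw new Error("unsupported value type");
    }
  }
  return decoded;
}

const auxdata =
  "a2646970667358221220" +
  "3a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e3" +
  "64736f6c63430007000033";
const decoded = decodeAuxdata(auxdata);
const solc = decoded["solc"];
// Release compilers store the version as 3 bytes: major, minor, patch.
const version = typeof solc === "string" ? hexToBytes(solc).join(".") : "unknown";
console.log(decoded["ipfs"]); // the multihash: 1220 followed by the 32-byte hash
console.log(version); // "0.7.0"
```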

To see how this cryptographically ensures the exactness of the source files we need to look into the contents of the metadata file. The metadata file is a JSON document that looks like this and contains information on two things:

  1. How to interact with the contract: ABI, documentation
  2. How to reproduce a contract compilation: compiler version and settings, source file information

The latter is the relevant field for our purposes. Specifically, the fact that the metadata file contains source file hashes. To illustrate this, let's walk through what happens when you compile a contract and what happens when you change a source file.

When you compile a contract, the compiler computes the hashes of the source files and embeds this information in the metadata file. On the right side, you see the relevant fields of the metadata file:

Embedding of the hash of the source files inside the metadata file

Then the compiler takes the hash of this whole file:

Taking the IPFS hash of the metadata file

And encodes it in the auxdata at the end of the bytecode:

Encoding of the metadata hash at the end of the bytecode

So if you were to decode the auxdata you'd see:

Decoding of the metadata hash at the end of the bytecode

What happens when we change something in the source files? Say we change a variable name or a comment in the new MyContract-diff.sol file. In turn the hash of the file changes, as well as the hash in the metadata:

The change of the hash when the source file changes

...and of course the hash of the metadata file changes:

The change of the hash when the metadata file changes

...and the auxdata changes:

The change of the auxdata when the hash changes

Sooo, if we match both the bytecode + the appended auxdata, we have byte-by-byte exactly the same source code and compilation settings of the original deployed contract. This is a perfect verification.

The perfect verification

If the bytecode matches but not the auxdata (which includes the metadata hash), we have a partial verification.

The partial verification

Did you notice?

If you are familiar with IPFS and paid attention, you might ask: Can't we already get everything from the bytecode itself?

And yes, if published on IPFS, you can actually fetch the source code from the bytecode of a contract, because all the information is already there:

  • The metadata IPFS hash is appended to the bytecode so (if published) you can fetch the metadata file.
  • The metadata file contains (alongside the normal keccak256) the IPFS hashes of the source files so you can fetch the complete source code from IPFS.
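One detail when doing this yourself: the auxdata holds the raw multihash bytes, while IPFS gateways and the urls in the metadata use the base58 form that starts with Qm (CIDv0). A minimal sketch of that conversion is below. base58Encode is written out by hand here to keep the example dependency-free, and the actual fetching from IPFS is left out.

```typescript
const BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Converts a hex string (no 0x prefix) into a byte array.
function hexToBytes(hex: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < hex.length; i += 2) {
    bytes.push(parseInt(hex.substring(i, i + 2), 16));
  }
  return bytes;
}

// Plain base58 (Bitcoin alphabet) encoding using schoolbook base conversion.
function base58Encode(bytes: number[]): string {
  const digits: number[] = [0]; // base58 digits, least significant first
  for (let n = 0; n < bytes.length; n++) {
    let carry = bytes[n];
    for (let j = 0; j < digits.length; j++) {
      carry += digits[j] * 256;
      digits[j] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }
  let output = "";
  // Each leading zero byte is represented by a leading "1".
  for (let k = 0; k < bytes.length && bytes[k] === 0; k++) output += "1";
  let top = digits.length - 1;
  while (top >= 0 && digits[top] === 0) top--; // skip unused high digits
  for (let q = top; q >= 0; q--) output += BASE58_ALPHABET[digits[q]];
  return output;
}

// The "ipfs" value of the example auxdata: 0x12 (sha2-256), 0x20 (32 bytes), then the hash.
const multihashHex = "1220" + "3a05097003697b26b1da819218bcd95779598eaa90539e82a59ecbe4c09757e3";
const cid = base58Encode(hexToBytes(multihashHex));
console.log(cid); // a 46-character CIDv0 starting with "Qm"
```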

So there's only one thing that you need to do as a contract developer: Publish your source files and metadata on IPFS.

Why do you need verification then? Isn't the source file already out there?

Although unlikely since the compiler does it automatically, someone can change the auxdata of the contract before deploying it and show you a different random source code. We make sure it really is the same code by doing a whole recompilation of the provided files and comparing the resulting bytecodes. Plus, we share all verified contracts in our repository on IPFS to make sure it's available.


Perfect verification enables more secure and transparent verification on contracts, as well as other useful things such as decoding tx's and enabling human-readable contract interactions, but this is a topic for another article.

Next level smart contract verification is already here. We just need to adopt this way of verifying contracts as a community. Obviously, we need a lot of tooling, integrations, and more awareness. Let's step up and make this the standard way of verifying contracts!

(This article is a summary of my recent talks about Sourcify. If you are interested in learning more, check out one of the latest talks)