Sourcify Database
Sourcify Database is the main storage backend for Sourcify. It is a PostgreSQL database that uses the Verifier Alliance Schema as its base, with a few modifications.
At a high level, these modifications are:
- Sourcify DB accepts contracts without deployment details such as block_number and transaction_hash, as well as contracts without an onchain creation bytecode (contracts.creation_code_hash).
- It stores the Solidity metadata separately in the sourcify_matches table.
- It introduces tables for other purposes.
You can follow the services/database/migrations folder for the initial schema and the changes made to it. These migrations are not necessarily the differences between Sourcify DB and the Verifier Alliance Schema; they record every change made to the schema over time.
Schema
You can access the live schema of the database here or in the embedded frame below.
In short:
- Every verified contract is a coupling between a deployed contract (contract_deployments) and a compilation (compiled_contracts).
- "Transformations" are applied to reach the final matching onchain bytecode from a bytecode from a compilation.
- Contract bytecodes are "normalized" for deduplication. A bytecode of a popular contract like ERC20.sol will only be stored once.
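The coupling and deduplication described above can be illustrated with a toy model. This is a minimal sketch using an in-memory SQLite database, not the real Sourcify schema: the column names, the use of SHA-256 as the code key, and the placeholder bytecode and addresses are all assumptions made for the example.

```python
import hashlib
import sqlite3

# Toy schema: a heavily simplified stand-in, NOT the real Sourcify tables.
TOY_SCHEMA = """
CREATE TABLE code (code_hash TEXT PRIMARY KEY, code BLOB NOT NULL);
CREATE TABLE contract_deployments (
    id INTEGER PRIMARY KEY, chain_id INTEGER, address TEXT,
    runtime_code_hash TEXT REFERENCES code(code_hash));
CREATE TABLE compiled_contracts (
    id INTEGER PRIMARY KEY, name TEXT,
    runtime_code_hash TEXT REFERENCES code(code_hash));
CREATE TABLE verified_contracts (
    id INTEGER PRIMARY KEY,
    deployment_id INTEGER REFERENCES contract_deployments(id),
    compilation_id INTEGER REFERENCES compiled_contracts(id));
"""


def store_code(db, bytecode):
    """Insert bytecode once, keyed by its hash, and return the hash."""
    code_hash = hashlib.sha256(bytecode).hexdigest()  # hash choice is an assumption
    db.execute("INSERT OR IGNORE INTO code VALUES (?, ?)", (code_hash, bytecode))
    return code_hash


def build_toy_db():
    db = sqlite3.connect(":memory:")
    db.executescript(TOY_SCHEMA)
    erc20 = bytes.fromhex("6080604052")  # placeholder bytes, not a real contract

    # One compilation of the contract.
    compilation_id = db.execute(
        "INSERT INTO compiled_contracts (name, runtime_code_hash) VALUES (?, ?)",
        ("ERC20", store_code(db, erc20)),
    ).lastrowid

    # Two deployments of identical bytecode on different chains (made-up addresses).
    for chain_id, address in [(1, "0xaaa"), (10, "0xbbb")]:
        deployment_id = db.execute(
            "INSERT INTO contract_deployments (chain_id, address, runtime_code_hash) "
            "VALUES (?, ?, ?)",
            (chain_id, address, store_code(db, erc20)),
        ).lastrowid
        # A verified contract couples one deployment with one compilation.
        db.execute(
            "INSERT INTO verified_contracts (deployment_id, compilation_id) VALUES (?, ?)",
            (deployment_id, compilation_id),
        )
    return db
```

In this toy model, two verified contracts share a single row in the code table, because the bytecode is stored under its hash and repeated inserts are ignored.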
For more information about the schemas of the JSON fields below, check the Verifier Alliance repo.
JSON fields of the verified_contracts table:
- creation_values
- creation_transformations
- runtime_values
- runtime_transformations
The transformations and values are the operations done on a bytecode from a compilation to reach the final matching onchain bytecode.
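As a rough illustration of the idea, the sketch below applies a list of transformations to a compiled bytecode. The record layout (type, reason, offset, id), the two operation names ("replace" and "insert"), and the sample values are assumptions made for this example; the authoritative format is the one documented in the Verifier Alliance repo.

```python
def apply_transformations(compiled_hex, transformations, values):
    """Patch compiled bytecode with values to rebuild the onchain bytecode.

    Assumed layout (hypothetical, for illustration only):
      transformations: [{"type": "replace" | "insert", "reason": str,
                         "offset": int (in bytes), "id": str (optional)}]
      values: {reason: hex string} or {reason: {id: hex string}}
    """
    code = bytearray(bytes.fromhex(compiled_hex.removeprefix("0x")))
    for t in transformations:
        value = values[t["reason"]]
        if isinstance(value, dict):
            # Several values may share one reason; pick the one with this id.
            value = value[t["id"]]
        patch = bytes.fromhex(value.removeprefix("0x"))
        offset = t["offset"]
        if t["type"] == "replace":
            # Overwrite len(patch) bytes in place, e.g. a metadata hash.
            code[offset:offset + len(patch)] = patch
        elif t["type"] == "insert":
            # Add bytes without overwriting, e.g. appended constructor arguments.
            code[offset:offset] = patch
        else:
            raise ValueError(f"unknown transformation type: {t['type']}")
    return "0x" + code.hex()


# Usage with made-up bytes: replace a 2-byte placeholder, then append 1 byte.
onchain = apply_transformations(
    "0x6000aaaa",
    [
        {"type": "replace", "reason": "cborAuxdata", "offset": 2, "id": "1"},
        {"type": "insert", "reason": "constructorArguments", "offset": 4},
    ],
    {"cborAuxdata": {"1": "0xbbbb"}, "constructorArguments": "0x01"},
)
print(onchain)
```

The sketch applies the operations in list order and does not adjust later offsets after an insert, which is enough when the inserted bytes go at the end of the code.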
JSON fields of the compiled_contracts table:
- sources: Source code files of a contract
- compiler_settings
- compilation_artifacts: Fields from the compilation output JSON. Fields: abi, userdoc, devdoc, sources (AST identifiers), storageLayout
- creation_code_artifacts: Fields under the evm.bytecode field. Fields: sourceMap, linkReferences, cborAuxdata
- runtime_code_artifacts: Fields under the evm.deployedBytecode field. Fields: sourceMap, linkReferences, cborAuxdata, immutableReferences
Download
We dump the whole database daily in Parquet format and upload it to Cloudflare R2 storage. You can access the manifest file at https://export.sourcify.dev (the .dev domain redirects to a .app domain, which also belongs to Sourcify). The script that does the dump is at sourcifyeth/parquet-export.
export.sourcify.dev redirects to a manifest.json file:
manifest.json
{
  "timestamp": 1726030203254,
  "dateStr": "2024-09-11T04:50:03.254904Z",
  "files": {
    "code": [
      "code/code_0_100000.parquet",
      "code/code_100000_200000.parquet",
      ...
      "code/code_2700000_2800000.parquet"
    ],
    "contracts": [
      "contracts/contracts_0_1000000.parquet",
      ...
      "contracts/contracts_4000000_5000000.parquet"
    ],
    "contract_deployments": [
      "contract_deployments/contract_deployments_0_1000000.parquet",
      ...
      "contract_deployments/contract_deployments_5000000_6000000.parquet"
    ],
    "compiled_contracts": [
      "compiled_contracts/compiled_contracts_0_5000.parquet",
      ...
      "compiled_contracts/compiled_contracts_815000_820000.parquet"
    ],
    "verified_contracts": [
      "verified_contracts/verified_contracts_0_1000000.parquet",
      ...
      "verified_contracts/verified_contracts_5000000_6000000.parquet"
    ],
    "sourcify_matches": [
      "sourcify_matches/sourcify_matches_0_100000.parquet",
      ...
      "sourcify_matches/sourcify_matches_5300000_5400000.parquet"
    ]
  }
}
You can download all the files and use a parquet client to query, inspect, or process the data.
Download the manifest file (-L to follow redirects):
curl -L -O https://export.sourcify.dev/manifest.json
Download all the tables listed in the manifest:
jq -r '.files | keys[] as $k | .[$k][]' manifest.json | xargs -I {} curl -L -O https://export.sourcify.dev/{}
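The same two steps can also be scripted. The following is a minimal Python sketch using only the standard library; the function names, the dest_dir default, and the optional tables filter are hypothetical conveniences for this example, and the code assumes the manifest keeps the "files" layout shown above.

```python
import json
import shutil
import urllib.request
from pathlib import Path

BASE_URL = "https://export.sourcify.dev"


def list_export_paths(manifest, tables=None):
    """Flatten manifest["files"] into relative Parquet paths.

    If tables is given, keep only the listed table names (manifest keys).
    """
    return [
        path
        for table, paths in manifest["files"].items()
        if tables is None or table in tables
        for path in paths
    ]


def download_export(dest_dir="sourcify-export", tables=None):
    """Download the manifest, then every listed file, into dest_dir."""
    # urllib follows HTTP redirects by default, like curl -L.
    with urllib.request.urlopen(f"{BASE_URL}/manifest.json") as resp:
        manifest = json.load(resp)

    for rel_path in list_export_paths(manifest, tables):
        target = Path(dest_dir) / rel_path
        # Keep the per-table subdirectories used in the manifest paths.
        target.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(f"{BASE_URL}/{rel_path}") as resp, \
                open(target, "wb") as out:
            shutil.copyfileobj(resp, out)  # stream to disk, files can be large
```

Calling download_export(tables={"sourcify_matches"}) would fetch only the files listed under that key, which can be useful because the full export is large.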
For example, you can install parquet-cli to do basic inspection of a downloaded file:
brew install parquet-cli
parquet meta compiled_contracts_0_5000.parquet
Alternatively, use your favorite data processing tool or import this data into a database.