Vanity Ethereum Addresses

This will be a short post that goes over how to generate so-called “vanity” Ethereum addresses. The implementation will be in Go, but the principle can probably be applied in any programming language that has a secp256k1 library (either by bindings to the C implementation by bitcoin-core or some other implementation) and an implementation of the Keccak-256 hash function.

Primer on Ethereum Addresses

The Ethereum ecosystem inherits the elliptic curve used by Bitcoin called secp256k1 (I’ve heard it most often pronounced “seck-p-256-k1”). This curve provides the cryptographic group that allows users to sign transactions on both Bitcoin and Ethereum. Technically speaking, the same (private, public) key-pair can be used for both Bitcoin and Ethereum, but the addresses that are accepted by wallets of each ecosystem are different.

There are two steps to generating a (private, public) key-pair for any public-key cryptosystem. The private key is generated first, and the public key is generated using the private key, in such a way that it is computationally infeasible to derive the private key from the public key.

To make this concrete, for secp256k1:

  1. Use a CSPRNG (stands for “cryptographically secure random number generator”) to generate 256 random bits (i.e 32 random octets). This is the private key, denoted s. Note that the interpretation of this is that it is an unsigned 256 bit number.

  2. Perform the multiplication P = s * G where G is the so-called “generator point” of the secp256k1 elliptic curve. (See here among other good posts to learn more about elliptic curve cryptography). The result P is a point on the elliptic curve, i.e it is a pair of unsigned 256 bit numbers, although implementations may use more space in order to achieve constant-time integer arithmetic for this bit size.

The above two steps are performed for both Ethereum and Bitcoin (and really, any application that uses secp256k1). At this point, however, things start to diverge. This is mainly due to the fact that Ethereum and Bitcoin use different hashing algorithms - Bitcoin uses SHA-256 while Ethereum uses Keccak-256.

For Ethereum, the remaining steps are as follows:

  1. Serialize (or in Go terminology, marshal) the public key point retrieved in step (2) in uncompressed form (see https://www.secg.org/sec1-v2.pdf, Section 2.3.3, follow step (3) for uncompressed point serialization).

  2. Take the Keccak-256 hash of the serialized public key from step (3), excepting the first byte (the first byte is 0x04 in hex, that represents an uncompressed elliptic curve public key - this is so that applications that deal with public keys know how to deserialize compressed vs. uncompressed elliptic curve public keys). In Go-style pseudocode, this would be keccak256(serializedPubKey[1:]).

  3. Take the last 20 bytes of the Keccak-256 hash - that is your Ethereum address.

Of course, these 20 bytes are then encoded into a hexadecimal string representation and prefixed by 0x, which is one way to express a hexadecimal string.

Contract Addresses

As you may already know, Ethereum doesn’t only have so-called “externally owned accounts” or EOA’s - accounts that are generated by the procedure defined above. There are also contract accounts. These accounts have two things that regular EOA’s don’t: code (in the sense of code that can be executed by the Ethereum Virtual Machine (EVM) - EVM bytecode), and storage (i.e smart contract state).

Contract addresses are deterministically generated from a source address - this is the address of the deployer, whether it be a smart contract or an EOA (yes, smart contracts can deploy other smart contracts!), and a nonce - this is a number that is used only once (number once) and never again, and you’ll see why when I detail the procedure below.

  1. Encode the array [sourceAddress, nonce] into recursive-length prefix encoding. How this works is out of scope of this post and would probably derail it into a lot of not very interesting details, but see here if you’re interested in the details.

  2. Take the Keccak-256 hash of the encoded bytes obtained in (1), and take the last 20 bytes. Those bytes represent the contract address.

Note that due to the hardness of the discrete logarithm problem for elliptic curves AND the collision resistance properties of Keccak-256, it is computationally infeasible to obtain the private key associated with this address

Narcissus Called: He Wants a Nice Ethereum Address

SIDE NOTE: narcissus would be a cool name for a blockchain.

We can think of Ethereum addresses kind of like a car’s license plates - they unambiguously refer to a single vehicle (at least, in the eyes of the law) and having a easy-to-remember license plate, much like an easy-to-remember phone number, is a way to set yourself apart from “the rest”.

Luckily, unlike phone numbers, you can generate (almost) any Ethereum address you want, as long as you’re willing to dedicate the compute for it.

Also unlike phone numbers, we can’t just pick any 20 bytes and work backwards to get our private key - you still have 12 bytes missing from the Keccak-256 hash, and even if you did permute all possible 12 byte combinations, you still have to get the pre-image of that hash, AND that pre-image has to be a valid public key on the secp256k1 curve. And if you get that far (which I doubt you will) you still have to solve the discrete logarithm problem to get the private key associated with that public key, which is computationally infeasible for elliptic curve groups like secp256k1.

OK - so you can’t just pick any address you want. Well then how do you generate an address you like?

It’s really the simplest approach there is - generate and check!

Specifically, you might have to generate and check 100s of thousands (maybe millions even, depending on what kind of address you want) of (privae, public) key-pairs in order to get the address you want.

Here’s what the code could look like in Go:

import (
	"log"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/crypto"
)

type out struct {
	address common.Address
	priv    []byte
}

func newAddress() out {
	privateKey, err := crypto.GenerateKey()
	if err != nil {
		log.Fatal(err)
	}

	privateKeyBytes := crypto.FromECDSA(privateKey)
	address := crypto.PubkeyToAddress(privateKey.PublicKey)
	return out{
		address: address,
		priv:    privateKeyBytes,
	}
}

Now you just need to run newAddress enough times and inspect the address output and see if it matches your need.

The most common vanity conditions are prefixes and suffixes that have certain combinations of letters and/or numbers. For example, having 5 leading 0’s in your address, so that it looks like 0x00000... is a common ask for a vanity address. The more zeros you want to have in your prefix, the longer it’ll take to find it.

Speeding Things Up

The “generate and check” method sounds suspiciously like a mining algorithm - increment a nonce and hash that nonce with some block data and see if your hash has enough leading zeros (the “difficulty”). The biggest difference here is that we don’t win any ether if we generate the address we like.

The thing that these miners are able to do is execute many hashes in parallel. For example, this bitcoin miner can generate 255 trillion hashes per second. The reason they are able to do this is because they run specialized hardware circuits to execute e.g SHA-256. It’s unlikely you’d spend all that power to run such specialized hardware just to generate a vanity address, though (who knows, I may be surprised :-)).

The thing about Go is that it makes writing concurrent programs easy. So lets give it a shot.

First, we start with a producer goroutine:

func producer(o chan out, done chan struct{}) {
	for {
		select {
		case <-done:
			return
		default:
			output := newAddress()
			o <- output
		}
	}
}

This goroutine is responsible for generating new Ethereum addresses and piping them out to a channel.

And then there’s the consumer:

func consumer(o chan out, done chan struct{}, desiredPattern, patternPosition string) out {
	var step uint64
	for {
		select {
		case data := <-o:
			if isValidVanityAddress(data.address, desiredPattern, patternPosition) {
				fmt.Println("steps taken:", step)
				return data
			}
			if *verbose && step%10000 == 0 {
				fmt.Println("step:", step, "latest addr:", data.address.Hex())
			}
			step++
		case <-done:
			return out{}
		}
	}
}

The consumer reads off the channel written to by the producer, and checks if it’s a valid vanity address. The validity of the address depends on what you want it to be. To keep things simple for this post, desiredPattern is not a regular expression, and is expected to consist of valid hexadecimal characters, and patternPosition is either "prefix" or "suffix" to indicate the position of this pattern. Of course, you can make this as complex as you like here, it’ll just cost you more compute cycles.

Here’s a simple implementation of isValidVanityAddress:

func isValidVanityAddress(a common.Address, desiredPattern, patternPosition string) bool {
	if patternPosition == "prefix" {
		return strings.HasPrefix(
			strings.ToLower(a.Hex()[2:]), // EIP-55
			desiredPattern)
	} else {
		return strings.HasSuffix(
			strings.ToLower(a.Hex()[2:]), // EIP-55
			desiredPattern)
	}
}

We call strings.ToLower because the Hex method on go-ethereum’s Address returns a checksummed address (see here for details).

Finally, here’s the code that’ll spawn all the goroutines and brings things together:

var (
	numWorkers      = flag.Int("num-workers", runtime.NumCPU(), "number of workers to use. for generate only.")
	desiredPattern  = flag.String("desired-pattern", "", "desired prefix or suffix pattern.  for generate only.")
	patternPosition = flag.String("pattern-position", "prefix", "whether the pattern should be prefix or suffix.  for generate only.")
	outFile         = flag.String("o", "out.txt", "output file to store info in")
	verbose         = flag.Bool("verbose", false, "whether to log progress or not")
)

func main() {
	flag.Parse()
	out := make(chan out, 1_000)
	done := make(chan struct{})
	for i := 0; i < *numWorkers; i++ {
		go producer(out, done)
	}
	o := consumer(out, done, *desiredPattern, *patternPosition)
	f, err := os.Create(*outFile)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	f.WriteString(fmt.Sprintf("address: %s\n", o.address.Hex()))
	f.WriteString(fmt.Sprintf("private key hex: %s\n", hexutil.Encode(o.priv)))
}

Alright, let’s run it! I’m going to give it a vanity address condition of 5 zeros and see how long it takes!

$ go run main.go -cmd generate -desired-pattern 00000 -o test1.txt
steps taken: 363091

The output was the address 0x000007A1F37842542037C20cb6Ea2f3De0D28Bad. Nice! Lets try something else:

$ go run main.go -cmd generate -desired-pattern dead -o test1.txt
steps taken: 223813

dead is actually a valid hex string - it’s used as a burn address in Ethereum, to remove tokens from supply permanently. In case it’s not clear, don’t send your ether (or your ERC20 tokens, or your NFT’s) there, you won’t be able to get it back :-).

The address I got from the above command is 0xdeADa0c566Cc399Ca543299206929DcD8E9557d1.

And just one more, for some nice steak:

$ go run main.go -cmd generate -desired-pattern beef -o test1.txt
steps taken: 100453

Address is: 0xbEeF6293cFC85DDd93dF291A9274B10AA2021ec4. Yummy!

Quick Maths

So, exactly how many iterations are needed to get an address string that you like? Some quick maths here will show you how quickly the iteration count blows up as you increase the length of your prefix or suffix.

Each character in a hex string can take on one of 16 possible characters - 0-9 for the digits and a-f for the letters. In order to match n characters, then, that is 1 combination out of 16^n combinations. So how many tries do we need, on average, to get the combination that we need?

We can use the geometric distribution to figure this out. We can think of each address generation try as a “trial”, and then we can denote 1 / 16^n the probability of getting the prefix/suffix that we like for the vanity address. Then, we can write

P(X = k) = (1 - 1/16^n)^{k-1} * (1 / 16^n)

Now we just need to take the mean - or the average - of the geometric distribution. This is also known as the “expected value”. We’ll leave the derivation of the formula to the reader (or you can always check it online) but we’ll just state it here:

E(X) = 1 / (1 / 16^n) = 16^n

A more useful number though is the median. For the geometric distribution, the median is defined to be:

M(X) = -1 / log2(1 - 1/16^n)

For e.g n = 5 this gives approximately 726817 - so around that many trials are probably needed to get the vanity address desired. Of course, you could get lucky and do it in much less tries, or you could get unlucky and have it take more tries, but in the end, that is due to the “randomness” in the private key generation.

Conclusion

Anyway, hope that was interesting. So long!

Written on March 26, 2023