Fair Ransomware Protocol using Bitcoin

Introduction

Ransomware is one of the malicious programs (malware), which encrypts victim's data and requests money such as Bitcoins as ransom for decrypting the data stored in their computer. Recently, “WannaCry” has become popular worldwide. However, as a simple question, if a victim sends Bitcoins as ransom, will they be able to decrypt their encrypted data? In other words, current ransomware may make them lose both money and data. In this article, we assume that there are Alice who is an author of a ransomware and she want to get Bitcoins, and Bob who has been infected with Alice's ransomware and his data has been encrypted. He wants to decrypt the data even though he pays Bitcoins to Alice, but the following two kinds of malicious action are possible between Alice and Bob.

  1. Alice first received Bitcoins from Bob, although she will not send the key to decrypt the ciphertext which Bob has
  2. Bob gave the key to decrypt the ciphertext from Alice, although he will not send Bitcoins to her

We show that the fair ransomware protocol to prevent these malicious actions with a high probability. Also this protocol uses current Bitcoin's property and cryptographic techniques, so it does not require any trusted third party. In the first section I will explain cryptographic techniques which are necessary for protocol configuration, and I will introduce Bitcoin in next section, then in the next section we will consider the precondition of Alice and Bob. Next I will explain the details of the fair ransomware protocol, finally I will talk the conclusion and the references. Note that you can get the source code of this article from here.

Cryptographic Techniques

In this section, I explain cryptographic techniques for constructing the protocol.

Symmetric Key Encryption

Symmetric key encryptions are encryption schemes that use the same key for encrypting and decrypting. We can encrypt the any length data that is called a plaintext by the fixed length of the data that is called a symmetric key. We encrypt the data x using a symmetric key k, in this article it will be written as follows by the Enc as an encrypting function.

Enck(x)

Using a decrypting function Dec, we able to decrypt a ciphertext which is the result of encrypting. We can decrypt if the symmetric key used for decryption is equal to the symmetric key used for encryption. So the following equation holds.

x=Deck(Enck(x))

But, symmetric key encryptions do not guarantee that x=Enck(Deck(x)) which is that the order of encryption and decryption is switched. Symmetric key encryptions are faster than public key encryptions like RSA encryption which will be described later. AES is one of the most famous implementation of symmetric key encryptions. And recently ChaCha20 has been proposed. This article's protocol does not require a specific implementation of symmetric key encryptions, so we write Enc as the encrypting function of symmetric key encryption and Dec as decrypting function of that.

Hash Function

Hash functions output data of a fixed length for any length of data as input. For a hash function H, its return H(x) of data x is called a hash value. And for a hash value h:=H(m), m is called the preimage of the hash value h. Hash functions have the following properties.

Pre-image resistance
Given a hash value h it should be difficult to find any message m such that h=H(m).
Second pre-image resistance
Given an input m1 it should be difficult to find different input m2 such that H(m1)=H(m2).
Collision resistance
It should be difficult to find two different messages m1 and m2 such that H(m1)=H(m2).

In this protocol hash functions must be run on the Bitcoin scripts described later, so an implementation of the hash functions must be SHA-1, SHA-256 or RIPEMD-160. We write H as a hash function, which is a SHA-1, SHA-256 or RIPEMD-160.

RSA Encryption

RSA encryption is one of the public key encryption. Public key encryptions are encryption schemes that use two different keys for encryption and decryption, unlike symmetric key encryptions. We call a key used for encryption a public key, and a key used for decryption a private key. Using the public key in the RSA encryption as (e,N), encrypting a plaintext x is as follows.

xemodN

N is a product of different huge prime numbers p,q. If you know the prime factors p,q of N, you can easily obtain a private key d which holds the following equation.

x=(xe)d=(xd)e(modN)

Unlike symmetric key encryptions, you can restore the plaintext x even if you switch the order of encrypting and decrypting. This property is used in signatures described later. Those who do not know the prime factors p,q of N, it is very difficult to obtain the private key d from the public key (e,N) and the ciphertext xemodN. If N is 2048 bits or more, even attackers use supercomputers, the prime factorization takes enought time, so we think RSA encryption is safe. Next we will describe some features of RSA encryption.

Plaintext size
At the time of encrypting and decrypting, we calculate modN so that plaintext must be from 0 to N1. If you want to encrypt the plaintext whose size is larger than N1, you must divide the plaintext for each log2N bits.
Blind and unblind
A person who has a ciphertext can decrypt it whithout knowing the ciphertext. They blind a ciphertext and send to a person who has a secret key and decrypts it, then remove the blind (unblind).
Signature
RSA encryption can also be used for signing. Signatures are mechanisms to verify that certain data was created by a person who has a secret key (signature key) with a public key (verification key).

First, I will explain blind and unblind in detail. Now, Bob has a ciphertext c encrypted by a public key (e,N), Alice has a secret key d. But he does not want her to know the ciphertext c and a plaintext t which is the result of decrypting c.

  1. Bob chooses a random number r(1r<N), s:=cremodN and sends it to Alice
  2. Alice calculates ˉs:=sdmodN and sends it to Bob. In this case, ˉs is as follows

    ˉs=sd=(cre)d=cd(re)d=cdr(modN)
  3. Bob calculates t:=ˉs/r. Therefore, t=cdmodN

We can decrypt ciphertexts with keeping their secret by this way. Next I explain signatures in detail. For example, there is a data t, Alice wants to show that this data was created by her to Bob. She has a secret key d. In addition, Alice and Bob know a public key (e,N) and a hash function H.

  1. Alice calculates h:=H(t) and s:=hdmodN and sends t,s to Bob
  2. Bob verifies the H(t)=semodN. In this case, semodN is as follows:

    se=(hd)e=h(modN)

Alice can caliculate hdmodN because she only knows the private key d. So she can prove that the data t was created with her intention. Signatures are used in Bitcoin, but Bitcoin uses ECDSA instead of RSA because the efficiency is better than RSA. Nonetheless, in ECDSA we need not change the process of signing the data by a private key and verifying it by a public key.

Zero-Knowledge Proof

Zero-Knowledge proofs are ways to prove that the proposition is true or false without being known other things. I will explain one of the cut-and-choose protocol which is a kind of Zero-Knowledge proofs. As a concrete method used in the fair rasomware protocol is described in later section, in this section I describe the intuition of the cut-and-choose protocol. The protocol proves as follows.

Assuming that Bob has a data c, can Alice calculate the result of applying some function f to the data c?

Alice wants to prove that she can calculate f(c) to Bob, but she does not want Bob to know the details of the function f and the result f(c).

  1. Bob computes the n number data called Fake Values F. However, Fake Values are calculated from only the random numbers and public information, they must not contain data c which Bob has
  2. Bob computes the l number data called Real Values R. However, Real Values are calculated from the random numbers and public information and the data c
  3. Bob makes β to concatenate Fake Values F and Real Values R and shuffle it
  4. For i:=1,,n+l, Alice calculates as follows:
    • Using a random number ki, a ciphertext si:=Encki(f(βi))
    • A hash value hi:=H(ki)
  5. Alice sends s1,,sn+l and h1,,hn+l to Bob
  6. Bob sends Fake Values F to Alice
  7. Alice verifies that βi does not contain c for all iF
  8. Alice sends ki to Bob for all iF
  9. For all iF, Bob verifies hi=H(ki) and the decrypted value Decki(si) is the result of applying the function f

We consider how they prove by this protocol. In (3) Alice does not know what the value is in the Real Values or the Fake Values. We assume that Alice cheat, using another function g, and now she caliculates the following manner.

si:=Encki(g(βi))

However, in (9), if βi is one of the Fake Values, Alice must publish the symmetric key ki and the results of applying f to Bob. Thus Bob knows that Alice did not calculate correctly. Indeed Alice applies f only to the n number Fake Values, then she applies a different function that is not f to the l number Real Values. But a probability of Alice's cheat to be successful is only one in the combination chosing the n number of data from the n+l number data. Therefore the probability is as follows.

1(n+ln)

So by setting n and l properly we can extremely reduce the probability of her cheat's success. Also, when Bob uses the data c falsely for the Fake Values and sends these to Alice, Bob gets the result f(c) in (9). In order to avoid this Alice publishs a symmetric key ki after she verified the Fake Values sent from Bob in (7). This also prevents Bob's cheat. This cut-and-choose protocol does not perform on any functions f, but it works very well with RSA encryption.

Bitcoin

Bitcoin is a currency, unlike US dollar, which it does not have a central bank. Instead of a central bank, Bitcoin uses a blockchain, which is a ledger of transactions. The blockchain is managed by many people who are called miners. Miners verify each transaction. Only a transaction is verified successfully, it will be added to the blockchain. Anyone can become a miner and receives Bitcoins for verifying transactions. How do the miners verify a transaction? They run programs called scripts which are included in the transaction. The script is a very restrictive non-Turing complete stack-based programming language, which does not include loops and recursions. For example, consider the following script.

123OP_ADDOP_SUB

This works as follows. For the sake of clarity, we also show that the state of the stack after run each step.

  1. Push 1 to the stack 1
  2. Push 2 to the stack 21
  3. Push 3 to the stack 321
  4. By OP_ADD, add the first value and the second value of the stack and push the result to the stack 51
  5. By OP_SUB, subtract the first value and the second value of the stack and push the result to the stack 4

Therefore, the final result of the execution of this script is 4.

Transactions of Bitcoin have two scripts, scriptPubKey and scriptSig. For example, Bob wants to send 1 BTC which is given from Alice to Charlie. A transaction from Alice to Bob is Tx.1, also a transaction from Bob to Charlie is Tx.2. The requirement for accepting Tx.2 is as follows.

Run scriptSig of Tx.2, then taking over the stack, run scriptPubKey of Tx.1, Tx.2 will be accepted unless the final stack is 0.

image.png

Unless the result of running scriptSig and scriptPubKey (eval) is 0, the transaction Tx.2 accepts and will be added to the blockchain. We will consider that Bob sends 1 BTC which given from Alice as shown in the above figure. In addition, B is a public key that corresponds to Bob's Bitcoin address. scriptSig of Tx.2 contains Bob's signature S and his public key B as follows.

SB

After run this script, the stack are BS, and then run ScriptPubKey the transaction Tx.1. This scriptPubKey is as follows.

OP_DUPOP_HASH160hOP_EQUALVERIFYOP_CHECKSIG

The result of running this script under the stack BS is as follows.

  1. By OP_DUP, copy the top of the stack BBS
  2. By OP_HASH160, caliculate a hash value of the top of the stack and push the hash value to the top of the stack H(B)BS
  3. Push h to the stack hH(B)BS
  4. By OP_EQUALVERIFY, compare the first value and the second value of the stack. If they are not equal immediately become failure BS
  5. By OP_CHECKSIG, verify the signature at the second of the stack by the public key at the top of the stack. If the verification is successful and push 1 to the stack, if it fails push 0 1

Bitcoin miners will accept the transaction if the result of the scripts is 1, which means that h=H(B) and the signature S is verified by the public key B. Common Bitcoin transactions are done in this way.

Precondition

Before describing the fair ransomware protocol, I will talk about Alice and Bob. Bob had data t1,,tm, then they have been encrypted with a public key (e,N) by Alice's ransomware. Now Bob has c1,,cm(ci:=(ti)emodN). In addition, Bob knows the public key (e,N) that was used in encrypting1. Alice requests x BTC as ransom for decrypting the Bob's data. And Alice needs to think as follows.

She will decrypt at least one amang encrypted Bob's data c1,,cm for free.

Why is such a condition necessary? Alice was able to execute arbitrary programs like ransomware on Bob's computer. Using the protocol described in the later section, Bob can decrypt his ciphertext by sending Bitcoins to Alice. However, Bob cannot decide whether the ciphertext is the result of encrypting the data stored in his computer, or the data created by Alice using random numbers. So Alice needs to decrypt at least one data for free, prove that Bob's ciphertext is the encrypted data originally stored in his computer. From this condition, the following condition are also derived at the same time.

Bob can determine whether at least one data amang t1,,tm ware stored in his computer or not.

Since there are a lot of files on Bob's computer, the condition does not seem so unrealistic. Finally, A is a public key that corresponds to Alice's Bitcoin address, also B is a public key that corresponds to Bob's Bitcoin address.

Fair Ransomware Protocol

We will construct the fair ransomware protocol. This protocol consists of Phase 1, Phase 2 and Phase 3, and it is executed in this order. If any phase fails, the deal will be canceled immediately.

Phase 1

In this phase, Bob will make sure that Alice can correctly decrypt a ciphertext ci specified by him.

  1. Bob creates a random number r(1r<N), select a ciphertext ci which is able to identify that the content is correct and he caliculates s:=ciremodN then he sends s to Alice
  2. Alice caliculates ˉs:=sdmodN using the secret key d and sends ˉs to Bob
  3. Bob confirms ti=ˉs/rmodN

First, Bob blinds the ciphertext ci in (1). It means that Bob makes Alice not know what ciphertext he tries to decrypt. If Bob sends the ciphertext ci to Alice to decrypt, in (2) she can return Bob's data which she stole from his computer in advance. In order to prevent this, Bob needs to blind the ciphertext which he wants to decrypt. If Alice left the i number data which Bob cannot decrypt amang all date t1,,tm as her cheat, in this case, the probability of this to be successful is the number of cases that Bob chooses other than i number data, so it is as follows.

mim

Alice's cheat will be successful if i is small relative to m, but it means Bob can decrypt a lot of data.

Phase 2

In this phase Alice will prove that all ciphertexts that Bob has can be decrypted with the same secret key by the cut-and-choose protocol. First, a prover Alice who wants to prove the proposition and a verifier Bob who wants to verify the proposition have the following knowledge.

Prover Alice
Alice knows the private key d for the public key (e,N), also a ciphertext s:=Enck(cdmodN) by symmetric key encryptions. And she knows a symmetric key k.
Verifier Bob
Bob knows the public key (e,N) and the ciphertext c, the ciphertext s by symmetric key encryptions. However, he does not know the secret key d corresponding to the public key (e,N) and the symmetric key k.

Alice wants to prove the following.

A preimage k of a hash value H(k) equals to the symmetric key k for decrypting the ciphertext s.

Bob makes sure that the preimage of H(k) is definitely the symmetric key k for decrypting the ciphertext s without Bob knowing the private key d and the symmetric key k. And the result of decrypting the ciphertext s is the result of decrypting the ciphertext c. Alice and Bob will run the following cut-and-choose protocol for each ciphertext c1,,cm in total m times. For a ciphertext cj the cut-and-choose is the following.

  1. Bob creates n random numbers r1,,rn(1ri<N) and calculates σi:=(ri)emodN(i:=1,,n)
  2. Bob creates l random numbers ρ1,,ρl(1ρi<N) and calculates δi:=cj(ρi)emodN(i:=1,,l)
  3. Bob chooses a random permutation β:={β1,,βn+l} for {σ1,,σn,δ1,,δl} and sends β to Alice. In addition all σi of β is in F and the all δi of β is in R
  4. For i:=1,,n+l, Alice
  • creates a random number ki and a ciphertext si:=Encki((βi)dmodN)
  • calculates a hash value hi:=H(ki)
  1. Alice sends s1,,sn+l and h1,,hn+l to Bob
  2. Bob sends r1,,rn and F to Alice
  3. Alice confirms βi=(ri)emodN for all iF. If it fails, Alice will cancel because Bob cheated
  4. Alice sends ki for all iF to Bob
  5. For all iF, Bob makes sure that hi=H(ki) and ri=Decki(si). If it fails, Bob will cancel

Alice can prove that the ciphertext si is the result of decrypting ci that is encrypted by her ransomware with the symmetric key ki and ki is the preimage of the hash value hi with a high probability.

Phase 3

In this phase, Bob sends Bitcoin to Alice and Alice publishes the secret key at the same time.

  1. Bob makes a transaction Tx.1 with the following scriptPubKey to send x BTC to Alice and sends the transaction to Bitcoin miners

    OP_SHA256h1OP_EQUALOP_IFOP_SHA256h2OP_EQUALVERIFYOP_SHA256hlOP_EQUALVERIFYAOP_ELSEblock height+100OP_CHECKLOCKTIMEVERIFYOP_DROPBOP_ENDIFOP_CHECKSIG
  2. Alice checks the following of the transaction Tx.1 on the blockchain. If it fails, Alice will cancel
  • The remittance is x BTC
  • h1,,hl are hash values that Alice has sent to Bob in Phase 2
  • A is the public key that corresponds to her Bitcoin address
  1. Alice makes a transaction Tx.2 as follows. Its scriptSig contains the symmetric key ki(iR) and Alice's signature SA. She sends it to miners

    k1k2klSA
  2. If Alice's transaction Tx.2 is accepted, the following two will happen at the same time where (iR)
  • Bob gets ti=Decki(si)/modN using the symmetric key ki of the transaction Tx.2 on the blockchain
  • Alice gets x BTC since the transaction Tx.2 was accepted

First, we consider how scriptPubKey of the transaction Tx.1 works in (1). In (3) Alice creates the transaction Tx.2 which contains the symmetric key ki(iR) and Alice's signature SA. Bitcoin miners run scriptSig of the transaction Tx.2 so the stack after run the scriptSig of the transaction Tx.2 is as follows.

k1k2klSA

The transaction Tx.1 scriptPubKey is run as follows under this stack.

  1. By OP_SHA256, calculate a hash value of the top of the stack and push it to the stack H(k1)k2klSA
  2. Push h1 to the stack h1H(k1)k2klSA
  3. By OP_EQUAL, compare the first value and the second value of the stack. If they equal push 1 to the stack, otherwise push 0. 1k2klSA
  4. By OP_IF, if the top of the satck is 1, pop and run then-clause (from OP_IF to OP_ELSE) k2klSA
  5. By OP_EQUALVERIFY, hash values H(k2),,H(kl) equal to h2,,hl. If any one does not equal, it fails SA
  6. Push Alice's public key A to the stack ASA
  7. By OP_CHECKSIG, verify a signature on the top of the stack with the second value of the stack as a public key. If the verification will be successful push 1 to the stack, otherwise push 0 to the stack 1

In this way the scripts will be successful. In addition, by this scriptPubKey if miners fail the comparison of the hash value h1, they will run else-clause (from OP_ELSE to OP_ENDIF). By this else-clause, Bob can withdraw the x BTC after a while even if Alice disappears after (1). Bob may make a transaction Tx.3 after the length of the blockchain increases more than 100 than when he has send the transaction Tx.1. Bob sends Tx.3 that has the following scriptSig using Bob's signature SB to get back his Bitcoin.

1SB

When miners run the transaction Tx.3 scriptSig the stack is as follows.

1SB

The scriptPubKey of the transaction Tx.1 will be run under the stack in the following manner.

  1. By OP_SHA256, calculate a hash value of the top of the stack and push the hash value to the stack H(1)SB
  2. Push h1 to the stack h1H(1)SB
  3. By OP_EQUAL, compare the first value and the second value of the stack. If they are equal push 1 to the stack, otherwise push 0 to the stack 0SB
  4. By OP_IF, if the top of the stack is 0, pop and run the else-clause SB
  5. Push a sum of the length of the blockchain when Tx.1 was accepted and 100 to the stack block height+100SB
  6. By OP_CHECKLOCKTIMEVERIFY, the script will fail if the length of the current blockchain is less than the top of the stack, otherwise push 1 to the stack 1SB
  7. By OP_DROPD, drop the top of the stack SB
  8. Push Bob's public key B to the stack SB
  9. By OP_CHECKSIG, verify a signature on the top of the stack with the second value of the stack as a public key. If the verification will be successful push 1 to the stack, otherwise push 0 to the stack 1

In this way, Bob will be able to withdraw Bitcoins.

Conclusion

We will be able to create a fair ransomware using this protocol. If I am forced to say, Alice proves that a ciphertext is the result of encrypting Bob's data by decrypting at least one data for free. If Alice can prove that without decrypting at least one data, it will be considered to be a better protocol.

Thank you for reading this article. Your comments, problem reports and questions are very welcome!

Acknowledgment

Some members of CTF team urandom helped me to fix this article. Some members of DWANGO English Club pointed out mistakes in this article and gave me advices about Engilsh.

References

This article says Zero-Knowledge Contingent Payment (ZKCP) and its application. The ZKCP allows us to exchange a preimage of a hash value and Bitcoins at the same time. It is exactly what we used in the fair ransomeware protocol. This paper says about the fair mixing of Bitcoin using ZKCP. If you are interested in ZKCP, you try to read these articles.

Contacts


  1. Bob can get the public key (e,N) by reverse engineering Alice's ransomware.

コメント