# SKY130 SHA3 Miner Caravel SOC
![miner](doc/sha3_256_miner.png)
## Table of Contents
* [Introduction](#introduction)
* [Implementation](#implementation)
  * [Miner Component](#miner-component)
    * [User block interface](#user-block-interface)
    * [Register file](#register-file)
    * [Status register](#status-register)
    * [Control register](#control-register)
    * [Verilog Module Hierarchy](#verilog-module-hierarchy)
* [Building](#building)
  * [Prerequisites](#prerequisites)
  * [Synthesizing](#synthesizing)
  * [Future](#future)
* [Picorv32 Firmware](#picorv32-firmware)
* [Acknowledgments](#acknowledgments)
## Introduction
In mining, a proof-of-work (POW) is used to verify the authenticity of a blockchain entry.
What is a POW?
A POW is a mathematical puzzle that is difficult to solve but easy to verify.
For this example, a SHA3 mining core is defined for a hypothetical blockchain that uses the
SHA3-256 hash.
Finding a hash that meets certain conditions is difficult; verifying that it does
is simple.
The core is implemented on Skywater's SKY130 process, courtesy of the
[Open MPW Shuttle Program] sponsored by Google.
We are given:
- H: 256-bit header (fixed input value)
- N: 64-bit nonce (the value we must find)
- D: 256-bit difficulty (problem difficulty, smaller value = more difficult)

The problem we need to solve is to find any value of N such that the SHA3-256 hash of the nonce concatenated
to the header is less than or equal to the difficulty:
`D >= SHA3({H, N})`
NOTE: This will not mine a real blockchain. It is intended as an example of a hashing algorithm optimized
for mining using a multi-stage permutation pipeline.
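
The condition `D >= SHA3({H, N})` can be sketched in software for reference. This is a minimal illustration, not the chip's bit-exact behavior: the byte order of the nonce, the interpretation of the digest as a big-endian integer, and the function names `meets_difficulty` and `find_nonce` are all assumptions made for this example.

```python
import hashlib

NONCE_MASK = (1 << 64) - 1  # the nonce is a 64-bit value


def meets_difficulty(header: bytes, nonce: int, difficulty: int) -> bool:
    """Return True if SHA3-256(header || nonce) <= difficulty."""
    if len(header) != 32:
        raise ValueError("header must be 32 bytes (256 bits)")
    # Assumption: the nonce follows the header as 8 big-endian bytes.
    message = header + (nonce & NONCE_MASK).to_bytes(8, "big")
    digest = hashlib.sha3_256(message).digest()
    # Assumption: the digest is compared as a big-endian unsigned integer.
    return int.from_bytes(digest, "big") <= difficulty


def find_nonce(header: bytes, difficulty: int, start_nonce: int = 0,
               max_tries: int = 1_000_000):
    """Try consecutive nonces and return the first that qualifies, else None."""
    for i in range(max_tries):
        nonce = (start_nonce + i) & NONCE_MASK
        if meets_difficulty(header, nonce, difficulty):
            return nonce
    return None
```

Verifying a candidate takes a single hash, while finding one takes on the order of `2**256 / D` attempts, which is the asymmetry a POW relies on.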
## Implementation
This ASIC is generated using end-to-end open source EDA tools. A 12-stage pipeline design is used in two phases, since
a fully unrolled 24-stage pipeline exceeds the capacity of this ASIC. Each stage consists of an identical, complex
combinatorial chain with 1600 data inputs, 6 control inputs, and 1600 outputs. Each stage must produce its 1600 output values,
based on the inputs and altered according to the control inputs, within one clock cycle. Using 12 such linked stages
and an appropriate feedback path from the last stage to the first, the pipeline delivers one hash per clock cycle during
half of the cycles, or one hash every two cycles on average. The maximum clock
speed is determined by the propagation delay of a single stage. The number of SHA3 hashes per second (hash rate in mining
parlance) that can be generated using this type of folded pipeline is given as follows:
- F: clock frequency in hertz
- S: number of stages (must be a divisor of 24, e.g., 1, 2, 4, 6, 12, or 24)
- H: hash rate

`H = (F * S) / 24`
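
As a worked example of the formula, the sketch below evaluates it for the 12-stage design using the 300 MHz figure listed for `user_clock2` in the interface table. The helper name `hash_rate` is hypothetical, and whether the fabricated chip sustains that clock is not established here.

```python
def hash_rate(clock_hz: float, stages: int, rounds: int = 24) -> float:
    """Hashes per second for a folded pipeline: H = (F * S) / 24."""
    if stages <= 0 or rounds % stages != 0:
        raise ValueError("stages must be a positive divisor of the round count")
    return clock_hz * stages / rounds


print(hash_rate(300e6, 12))  # 150000000.0, i.e. one hash every two clocks
print(hash_rate(300e6, 24))  # 300000000.0 for a fully unrolled pipeline
```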
A Wishbone client register file provides control and status access for the Caravel picorv32 CPU core. This circuitry
is conveniently clocked by the Wishbone bus clock. The 12 combinatorial stages, however, are clocked from a separate
user-programmable DLL clock, allowing hash rate adjustments. Proper synchronization is applied wherever a clock domain crossing occurs.
In mining we do not really care what the winning hash is; we only care that it meets the difficulty requirement and which
nonce was used to achieve it. The nonce is a continuously incrementing counter, so we simply freeze it when a match is found.
At a high level, the chip is intended to function as a low-level controller for the SHA3 pipeline, communicating via the
Caravel I2C or SPI ports with a larger computer that handles higher-level functions such as Internet mining protocols.
### Miner Component
The component is a Wishbone bus device with a 23-word memory-mapped register file for control and status. All user project
Verilog source is contained in the `verilog/rtl/user_proj_example.v` file.
#### User block interface
| Signal | Width | Description |
| --- | --- | --- |
| wb_clk_i | 1 | 50 MHz Wishbone bus clock (1-bit input) |
| wb_rst_i | 1 | Asynchronous reset (1-bit input) |
| wbs_stb_i | 1 | Select |
| wbs_cyc_i | 1 | Active bus cycle |
| wbs_we_i | 1 | Write enable |
| wbs_sel_i | 4 | Byte lane select |
| wbs_dat_i | 32 | Input data |
| wbs_adr_i | 32 | Address |
| wbs_ack_o | 1 | Bus cycle acknowledge |
| wbs_dat_o | 32 | Output data |
| io_in | MPRJ_IO_PADS | IO pin bus input |
| io_out | MPRJ_IO_PADS | IO pin bus output |
| io_oeb | MPRJ_IO_PADS | IO pin bus output enable |
| user_clock2 | 1 | 300 MHz miner core clock |
#### Register file
Each register occupies 4 bytes, starting at base address 0x30000000.
| Reg. # | Name | Read/Write | Description|
| --- | --- | --- | --- |
| 0-1 | SOLN_REG | RO | 64-bit Solution |
| 2 | STATUS_REG | RO | Status (see below) |
| 3 | SHA3_REG | RO | Fingerprint "SHA3" |
| 4-11 | HDR_REG | RW | 256-bit Header |
| 12-19 | DIFF_REG | RW | 256-bit difficulty |
| 20-21 | START_REG | RW | 64-bit start nonce |
| 22 | CTL_REG | RW | Control (see below) |
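
The register numbers above translate to byte addresses as base plus four times the register number. The following sketch tabulates them; the dictionary layout and the helper name `reg_addr` are illustrative choices, not part of the project source.

```python
BASE_ADDR = 0x30000000  # base address of the register file
REG_BYTES = 4           # each register occupies 4 bytes

# name -> (first register number, number of 32-bit words)
REGS = {
    "SOLN_REG": (0, 2),
    "STATUS_REG": (2, 1),
    "SHA3_REG": (3, 1),
    "HDR_REG": (4, 8),
    "DIFF_REG": (12, 8),
    "START_REG": (20, 2),
    "CTL_REG": (22, 1),
}


def reg_addr(name: str, word: int = 0) -> int:
    """Byte address of the given 32-bit word within a named register."""
    first, count = REGS[name]
    if not 0 <= word < count:
        raise IndexError(f"{name} has only {count} word(s)")
    return BASE_ADDR + REG_BYTES * (first + word)


for reg_name in REGS:
    print(f"{reg_name:<10} 0x{reg_addr(reg_name):08X}")
```

Which word of a multi-word register holds the most significant bits is not stated in the table, so the sketch does not assume an order.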
#### Status register
| Bit # | Name | Description |
| --- | --- | --- |
| 0 | FOUND | Solution found. Solution is stored and status updated |
| 1 | RUNNING | The run ctl bit is set and the solution nonce is auto-incrementing |
| 2 | TESTING | The test ctl bit is set and hashes are compared for an exact match with diff (for verification only) |
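
A value read from STATUS_REG can be decoded with simple bit masks. The constant names and the `decode_status` helper below are hypothetical and only mirror the bit positions in the table.

```python
STATUS_FOUND = 1 << 0    # solution found
STATUS_RUNNING = 1 << 1  # nonce is auto-incrementing
STATUS_TESTING = 1 << 2  # test mode is active


def decode_status(value: int) -> dict:
    """Split a raw STATUS_REG value into named boolean flags."""
    return {
        "found": bool(value & STATUS_FOUND),
        "running": bool(value & STATUS_RUNNING),
        "testing": bool(value & STATUS_TESTING),
    }


print(decode_status(0b011))
```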
#### Control register
| Bit # | Name | Description |
| --- | --- | --- |
| 0 | RUN | 0 - clear, 1 - auto increment the solution nonce and check hashes |
| 1 | TEST | 0 - normal mode, 1 - test mode, look for exact match with diff |
| 2 | HALT | 0 - normal mode, 1 - halt mining and update status |
| 23-16 | PAD_LAST | last pad byte, 0x80 for KECCAK-256 and SHA3-256 |
| 31-24 | PAD_FIRST | first pad byte, 0x01 for KECCAK-256, and 0x06 for SHA3-256 |
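
The control word can be assembled from the fields in the table as follows. This is a sketch only: the function name `control_word` and its defaults (SHA3-256 padding) are assumptions made for illustration.

```python
def control_word(run: bool = False, test: bool = False, halt: bool = False,
                 pad_first: int = 0x06, pad_last: int = 0x80) -> int:
    """Pack the CTL_REG fields into a 32-bit value."""
    if not (0 <= pad_first <= 0xFF and 0 <= pad_last <= 0xFF):
        raise ValueError("pad bytes must fit in 8 bits")
    return (
        (int(run) << 0)       # bit 0: RUN
        | (int(test) << 1)    # bit 1: TEST
        | (int(halt) << 2)    # bit 2: HALT
        | (pad_last << 16)    # bits 23-16: PAD_LAST
        | (pad_first << 24)   # bits 31-24: PAD_FIRST
    )


print(hex(control_word(run=True)))                  # 0x6800001 (SHA3-256)
print(hex(control_word(run=True, pad_first=0x01)))  # 0x1800001 (KECCAK-256)
```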
#### Verilog Module Hierarchy
```
Top module: \user_proj_example
Used module: \sha3_256_miner_core_12
Used module: \sha3_256_miner_round
Used module: \permutation
Used module: \sha3_256_miner_regs
```
## Building
### Prerequisites
- [Caravel]
- [Openlane]
- [Magic]
- [Google Skywater PDK]
### Synthesizing
From the command line:
```
git clone https://github.com/miscellaneousbits/caravel_sha3_256_crypto_miner.git
cd caravel_sha3_256_crypto_miner/openlane
make user_proj_example
make user_project_wrapper
cd ..
make ship
```
This will create the preliminary Skywater fab input artifacts: essentially a giant GDS file
containing a full physical description of the system on a chip, in this case over 300,000
logic cells and 1,000,000 copper traces.
NOTE: The entire process takes about 10 hours on a high-end PC with plenty of memory.
### Future
Currently this 12-stage pipeline design is fully autorouted and uses over 300,000 cells, which is the
most that can be crammed into the available die space. The layout is very sparse in order to get
successful routing. It may be possible to optimize and harden smaller repeating blocks,
then place them manually, to achieve a fully unrolled 24-stage pipeline. This would achieve
twice the performance of the current version.
## Picorv32 Firmware
TBD
Long live open-everything.
[Caravel]: https://github.com/efabless/caravel.git
[Openlane]: https://github.com/efabless/openlane.git
[Google Skywater PDK]: https://github.com/google/skywater-pdk.git
[Open MPW Shuttle Program]: https://www.efabless.com/open_shuttle_program
[Magic]: http://opencircuitdesign.com/magic