Merge pull request #3 from efabless/master

[DATA] add the manifests for the precheck
diff --git a/README.md b/README.md
index 2c77d56..007caee 100644
--- a/README.md
+++ b/README.md
@@ -1,123 +1,127 @@
-# CIIC Harness  
+# Skywater 130 Decred Miner
 
-A template SoC for Google SKY130 free shuttles. It is still WIP. The current SoC architecture is given below.
+## Table of Contents
+* [Introduction](#introduction)
+* [Implementation](#implementation)
+  * [Hash Unit Input Data](#hash-unit-input-data)
+  * [ASIC Chaining Support](#asic-chaining-support)
+  * [Register File](#register-file)
+  * [Verilog Module Hierarchy](#verilog-module-hierarchy)
+* [Building](#building)
+  * [Check Out](#check-out)
+  * [Build Decred Flow](#build-decred-flow)
+  
+  
 
-<p align=”center”>
-<img src="/doc/ciic_harness.png" width="75%" height="75%"> 
-</p>
+## Introduction
 
+Decred is a blockchain-based cryptocurrency that utilizes a hybrid Proof-of-Work (PoW) and Proof-of-Stake (PoS) mining system. More about Decred can be found at https://docs.decred.org.
 
-## Getting Started:
+The PoW element of Decred uses the BLAKE-256 (14 round) hashing function and is described in more detail at https://docs.decred.org/research/blake-256-hash-function.
 
-* For information on tooling and versioning, please refer to [this][1].
+The Skywater 130 Decred Miner project implements a BLAKE-256r14 hash unit that is optimized for the Decred blockchain (i.e., not a generic BLAKE-256r14 hash unit). In addition to the hash unit, the core also includes a SPI unit with addressable register space and a device interrupt; all to be used with a separate controller board. The core is implemented on Skywater’s SKY130 process.
 
-Start by cloning the repo and uncompressing the files.
-```bash
-git clone https://github.com/efabless/caravel.git
-cd caravel
+Several Decred ASICs have been produced in the past at process nodes much smaller than 130nm (some as small as 16nm). This project is not intended to compete with the performance per watt of those commercially available units. Rather, it is intended as a way to learn about the challenges of ASIC development and to provide a stepping stone for open-source ASIC development.
+
+## Implementation
+
+### Hash Unit Input Data
+
+The Decred blockchain provides a 180-byte header that includes common blockchain fields such as previous block hash, merkle root, timestamp, nonce, and height. It also includes Decred-specific fields such as voting information that works with the PoS portion of Decred. The Decred header specification can be found at https://devdocs.decred.org/developer-guides/block-header-specifications.
+
+The Decred PoW process runs variations of the header (plus 16-byte padding) through the BLAKE-256r14 hash function and compares the result to a numerical target (a smaller value is better). The part of the header that varies is the nonce space. A Nonce field exists at the end of the Decred header. While the Nonce field is only 32 bits, the ExtraData field can be used to expand the nonce space. After the full header is initially hashed, only the last chunk of 64 bytes needs to be rehashed for each change in the nonce space, because the Nonce and ExtraData fields are at the end of the header. The result of hashing the first 128 bytes of the header is referred to as the midstate. The controller board generates the header’s midstate and sends it, along with other static header data and the target difficulty information, to the core via the SPI interface. After the necessary data is sent, the controller board enables hashing. If the hash unit determines that a result satisfies the target difficulty, the core saves the solution nonce and raises an interrupt to the controller board. When the controller board handles the interrupt, it reads the solution nonce from the core’s register space.
+
+Midstate – 32 bytes (256 bits)
+
+Static Header Data – 16 bytes
+
+Threshold Mask – 4 bytes
+
+Upper Nonce Start – 4 bytes
+
+Note that Decred’s minimum difficulty of 1.0 corresponds to a target that has 0 in the most significant 32 bits (i.e., 0x00000000 XXXXXXXX YYYYYYYY YYYYYYYY YYYYYYYY YYYYYYYY YYYYYYYY YYYYYYYY), so the Threshold Mask only populates the second most significant word (i.e., the X’s). Based on the expected hash performance, Decred difficulties greater than 2^32 were impractical to support. This allowed for optimizations in the hash unit.
+
+### ASIC Chaining Support
+
+It is common for cryptocurrency mining machine manufacturers to chain several dozen ASIC chips together in a single unit to maximize hash rate SWaP (size, weight, and power). This project implements support for chaining ASICs to a single controller board.
+
+### Register File
+
+A small number of registers are provided at the register_bank level and accessed via the SPI interface. A read and a write to the same address can operate on different data (see the R/W column). A register window is used to interface with registers at the hash_macro level.
+```
+register_bank
+0x00  RW  Macro address
+0x01   W  Macro write data
+0x02  R   Macro interrupt status 
+0x02   W  Macro select (bit mapped)
+0x03  RW  Control byte
+        0 Macro read enable strobe
+        1 <unused>
+        2 Clk counter enable
+        3 LED output GPIO
+        4 M1 clk reset
+        5 Chain enable GPIO
+0x04  RW  SPI address [6:0]
+0x05  R   ID register
+0x05   W  Macro write strobe
+0x06  R   Macro ID register
+0x07  R   Perf counter [7:0]
+0x08  R   Perf counter [15:8]
+0x09  R   Perf counter [23:16]
+0x0A  R   Perf counter [31:24]
+0x80  R   Macro data
+
+hash_macro
+0x00 - 0x1F Midstate
+0x20 - 0x23 Threshold Mask
+0x24 - 0x33 Static Header Data
+0x34 - 0x37 Upper nonce start
+0x38 - 0x39 Nonce start
+0x3A - 0x3B Stride
+```
+### Verilog Module Hierarchy
+
+```
+decred_top.v
+   |
+    - clock_div.v
+   |
+    - decred.v
+         |
+          - addressalyzer.v
+         |
+          - spi_*.v
+         |
+          - register_bank.v
+               |
+                - hash_macro_nonblock.v
+```
+
+## Building
+Follow the steps at https://github.com/efabless/openlane#quick-start. 
+Note that as of the time of this writing, openlane mpw-one-a was the current release branch for the shuttle (i.e., git clone https://github.com/efabless/openlane.git --branch mpw-one-a).
+
+After `make test` succeeds, proceed to the Check Out step.
+
+### Check Out
+```
+cd openlane/designs
+git clone https://github.com/SweeperAA/caravel_skywater130_decred_miner.git
+cd caravel_skywater130_decred_miner
 make uncompress
 ```
 
-Then you need to install the open_pdks prerequisite:
- - [Magic VLSI Layout Tool](http://opencircuitdesign.com/magic/index.html) is needed to run open_pdks -- version >= 8.3.60*
+### Build Decred Flow
+Building to integrate into the caravel test harness chip is done in two steps.
 
- > \* Note: You can avoid the need for the magic prerequisite by using the openlane docker to do the installation step in open_pdks. This [file](https://github.com/efabless/openlane/blob/develop/travisCI/travisBuild.sh) shows how.
-
-Install the required version of the PDK by running the following commands:
-
-```bash
-export PDK_ROOT=<The place where you want to install the pdk>
-make pdk
+Step 1: Build the macro independent of the caravel chip.
+```
+cd caravel_skywater130_decred_miner/openlane
+make decred_top
 ```
 
-Then, you can learn more about the caravel chip by watching these video:
-- Caravel User Project Features -- https://youtu.be/zJhnmilXGPo
-- Aboard Caravel -- How to put your design on Caravel? -- https://youtu.be/9QV8SDelURk
-- Things to Clarify About Caravel -- What versions to use with Caravel? -- https://youtu.be/-LZ522mxXMw
-    - You could only use openlane:rc5
-    - Make sure you have the commit hashes provided here inside the [Makefile](./Makefile)
-## Aboard Caravel:
-
-Your area is the full user_project_wrapper, so feel free to add your project there or create a differnt macro and harden it seperately then insert it into the user_project_wrapper. For example, if your design is analog or you're using a different tool other than OpenLANE.
-
-If you will use OpenLANE to harden your design, go through the instructions in this [README.md][0].
-
-Then, you will need to put your design aboard the Caravel chip. Make sure you have the following:
-
-- [Magic VLSI Layout Tool](http://opencircuitdesign.com/magic/index.html) installed on your machine. We may provide a Dockerized version later.\*
-- You have your user_project_wrapper.gds under `./gds/` in the Caravel directory.
-
- > \* **Note:** You can avoid the need for the magic prerequisite by using the openlane docker to run the make step. This [section](#running-make-using-openlane-magic) shows how.
-
-Run the following command:
-
-```bash
-export PDK_ROOT=<The place where the installed pdk resides. The same PDK_ROOT used in the pdk installation step>
-make
+Step 2: Integrate macro into caravel user space.
+```
+make user_project_wrapper
 ```
 
-This should merge the GDSes using magic and you'll end up with your version of `./gds/caravel.gds`. You should expect hundred of thousands of magic DRC violations with the current "development" state of caravel.
-
-## Running Make using OpenLANE Magic
-
-To use the magic installed inside Openlane to complete the final GDS streaming out step, export the following:
-
-```bash
-export PDK_ROOT=<The location where the pdk is installed>
-export OPENLANE_ROOT=<the absolute path to the openlane directory cloned or to be cloned>
-export IMAGE_NAME=<the openlane image name installed on your machine. Preferably openlane:rc5>
-export CARAVEL_PATH=$(pwd)
-```
-
-Then, mount the docker:
-
-```bash
-docker run -it -v $CARAVEL_PATH:$CARAVEL_PATH -v $OPENLANE_ROOT:/openLANE_flow -v $PDK_ROOT:$PDK_ROOT -e CARAVEL_PATH=$CARAVEL_PATH -e PDK_ROOT=$PDK_ROOT -u $(id -u $USER):$(id -g $USER) $IMAGE_NAME
-```
-
-Finally, once inside the docker run the following commands:
-```bash
-cd $CARAVEL_PATH
-make
-exit
-```
-
-This should merge the GDSes using magic and you'll end up with your version of `./gds/caravel.gds`. You should expect hundred of thousands of magic DRC violations with the current "development" state of caravel.
-
-## Required Directory Structure
-
-- ./gds/ : includes all the gds files used or produced from the project.
-- ./def/ : includes all the def files used or produced from the project.
-- ./lef/ : includes all the lef files used or produced from the project.
-- ./mag/ : includes all the mag files used or produced from the project.
-- ./maglef/ : includes all the maglef files used or produced from the project.
-- ./spi/lvs/ : includes all the maglef files used or produced from the project.
-- ./verilog/dv/ : includes all the simulation test benches and how to run them. 
-- ./verilog/gl/ : includes all the synthesized/elaborated netlists. 
-- ./verilog/rtl/ : includes all the Verilog RTLs and source files.
-- ./openlane/`<macro>`/ : includes all configuration files used to run openlane on your project.
-- info.yaml: includes all the info required in [this example](info.yaml). Please make sure that you are pointing to an elaborated caravel netlist as well as a synthesized gate-level-netlist for the user_project_wrapper
-
-## Managment SoC
-The managment SoC runs firmware that can be used to:
-- Configure User Project I/O pads
-- Observe and control User Project signals (through on-chip logic analyzer probes)
-- Control the User Project power supply
-
-The memory map of the management SoC can be found [here](verilog/rtl/README)
-
-## User Project Area
-This is the user space. It has limited silicon area (TBD, about 3.1mm x 3.8mm) as well as a fixed number of I/O pads (37) and power pads (10).  See [the Caravel  premliminary datasheet](doc/caravel_datasheet.pdf) for details.
-The repository contains a [sample user project](/verilog/rtl/user_proj_example.v) that contains a binary 32-bit up counter.  </br>
-
-<p align=”center”>
-<img src="/doc/counter_32.png" width="50%" height="50%">
-</p>
-
-The firmware running on the Management Area SoC, configures the I/O pads used by the counter and uses the logic probes to observe/control the counter. Three firmware examples are provided:
-1. Configure the User Project I/O pads as o/p. Observe the counter value in the testbench: [IO_Ports Test](verilog/dv/caravel/user_proj_example/io_ports).
-2. Configure the User Project I/O pads as o/p. Use the Chip LA to load the counter and observe the o/p till it reaches 500: [LA_Test1](verilog/dv/caravel/user_proj_example/la_test1).
-3. Configure the User Project I/O pads as o/p. Use the Chip LA to control the clock source and reset signals and observe the counter value for five clock cylcles:  [LA_Test2](verilog/dv/caravel/user_proj_example/la_test2).
-
-[0]: openlane/README.md
-[1]: mpw-one-a.md
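
As a rough illustration of the hash_macro register map documented in the README above, the following Python sketch lays the fields out in a 0x3C-byte image at the listed offsets. The function and dictionary names are hypothetical and are not part of this change. Each field is passed as raw bytes because the README does not state a byte order.

```python
# Offsets and sizes copied from the hash_macro table in the README.
# The names below are illustrative only (assumption, not from the RTL).
HASH_MACRO_FIELDS = {
    "midstate": (0x00, 32),            # 0x00 - 0x1F
    "threshold_mask": (0x20, 4),       # 0x20 - 0x23
    "static_header_data": (0x24, 16),  # 0x24 - 0x33
    "upper_nonce_start": (0x34, 4),    # 0x34 - 0x37
    "nonce_start": (0x38, 2),          # 0x38 - 0x39
    "stride": (0x3A, 2),               # 0x3A - 0x3B
}

IMAGE_SIZE = 0x3C  # one past the last documented address (0x3B)


def build_hash_macro_image(**fields: bytes) -> bytes:
    """Place each field at its documented offset and return the full image."""
    unknown = set(fields) - set(HASH_MACRO_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    image = bytearray(IMAGE_SIZE)
    for name, (offset, size) in HASH_MACRO_FIELDS.items():
        value = fields[name]  # KeyError if a field is missing
        if len(value) != size:
            raise ValueError(f"{name} must be {size} bytes, got {len(value)}")
        image[offset:offset + size] = value
    return bytes(image)
```

A controller-side script could build this image once per work unit and then write it out byte by byte through the register window. The exact write sequence through the window is not described here and is left out of the sketch.
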
diff --git a/info.yaml b/info.yaml
index 685ee5e..1c36a51 100644
--- a/info.yaml
+++ b/info.yaml
@@ -2,16 +2,16 @@
 project: 
   description: "A template SoC for Google sponsored Open MPW shuttles for SKY130."
   foundry: "SkyWater"
-  git_url: "https://github.com/efabless/caravel.git"
-  organization: "Efabless"
-  organization_url: "http://efabless.com"
-  owner: "Tim Edwards"
+  git_url: "https://github.com/SweeperAA/caravel_skywater130_decred_miner.git"
+  organization: ""
+  organization_url: ""
+  owner: "Matt Aamold, James Aamold"
   process: "SKY130"
-  project_name: "Caravel"
+  project_name: "Skywater 130 Decred Miner"
   tags: 
     - "Open MPW"
-    - "Test Harness"
-  category: "Test Harness"
+    - "Decred"
+  category: "Decred"
   top_level_netlist: "verilog/gl/caravel.v"
   user_level_netlist: "verilog/gl/user_project_wrapper.v"
   version: "1.00"
diff --git a/openlane/decred_top/config.tcl b/openlane/decred_top/config.tcl
new file mode 100644
index 0000000..c00f6be
--- /dev/null
+++ b/openlane/decred_top/config.tcl
@@ -0,0 +1,51 @@
+# Design
+set ::env(DESIGN_NAME) "decred_top"
+
+set script_dir [file dirname [file normalize [info script]]]
+
+set ::env(VERILOG_FILES) "\
+   $script_dir/../../verilog/rtl/defines.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/decred_defines.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/decred_top.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/addressalyzer.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/clock_div.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/decred.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/hash_macro_nonblock.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/register_bank.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/spi_passthrough.v \
+   $script_dir/../../verilog/rtl/decred_top/rtl/src/spi_des.v"
+
+set ::env(BASE_SDC_FILE) "$script_dir/decred_top.sdc"
+
+set ::env(CLOCK_PORT) "M1_CLK_IN PLL_INPUT S1_CLK_IN"
+set ::env(CLOCK_NET) "clock_divBlock.even_0.clk decred_macro.SPI_CLK"
+
+set ::env(DESIGN_IS_CORE) 0
+
+set ::env(FP_SIZING) absolute
+set ::env(DIE_AREA) "0 0 2800 3400"
+
+# 2 hash units
+set ::env(CLOCK_PERIOD) "15.000"
+#default is 50
+set ::env(FP_CORE_UTIL) "50"
+set ::env(PL_TARGET_DENSITY) 0.55
+set ::env(SYNTH_STRATEGY) "1"
+set ::env(CELL_PAD) "4"
+#default is 0.15
+set ::env(GLB_RT_ADJUSTMENT) "0.15"
+#default is 3
+set ::env(DIODE_INSERTION_STRATEGY) "3"
+set ::env(GLB_RT_MAX_DIODE_INS_ITERS) 10
+# default is 5
+set ::env(SYNTH_MAX_FANOUT) "5"
+#default is 1
+set ::env(FP_ASPECT_RATIO) "1"
+#default is 0
+set ::env(FP_PDN_CORE_RING) 0
+#default is 6
+set ::env(GLB_RT_MAXLAYER) 5
+#default is 0
+set ::env(PL_BASIC_PLACEMENT) 0
+
+set ::env(ROUTING_CORES) 8
diff --git a/openlane/decred_top/decred_top.sdc b/openlane/decred_top/decred_top.sdc
new file mode 100644
index 0000000..649f4ae
--- /dev/null
+++ b/openlane/decred_top/decred_top.sdc
@@ -0,0 +1,31 @@
+create_clock [get_ports M1_CLK_IN]  -name M1_CLK_IN  -period 15
+create_clock [get_ports PLL_INPUT]  -name PLL_INPUT  -period 15
+create_clock [get_ports S1_CLK_IN]  -name S1_CLK_IN  -period 100
+set_clock_groups -asynchronous \
+   -group [get_clocks {M1_CLK_IN}] \
+   -group [get_clocks {PLL_INPUT}] \
+   -group [get_clocks {S1_CLK_IN}] 
+
+set input_delay_value [expr $::env(CLOCK_PERIOD) * $::env(IO_PCT)]
+set output_delay_value [expr $::env(CLOCK_PERIOD) * $::env(IO_PCT)]
+puts "\[INFO\]: Setting output delay to: $output_delay_value"
+puts "\[INFO\]: Setting input delay to: $input_delay_value"
+
+
+set clk_indx [lsearch [all_inputs] [get_port $::env(CLOCK_PORT)]]
+#set rst_indx [lsearch [all_inputs] [get_port resetn]]
+set all_inputs_wo_clk [lreplace [all_inputs] $clk_indx $clk_indx]
+#set all_inputs_wo_clk_rst [lreplace $all_inputs_wo_clk $rst_indx $rst_indx]
+set all_inputs_wo_clk_rst $all_inputs_wo_clk
+
+
+# correct resetn
+set_input_delay $input_delay_value  -clock [get_clocks $::env(CLOCK_PORT)] $all_inputs_wo_clk_rst
+#set_input_delay 0.0 -clock [get_clocks $::env(CLOCK_PORT)] {resetn}
+set_output_delay $output_delay_value  -clock [get_clocks $::env(CLOCK_PORT)] [all_outputs]
+
+# TODO set this as parameter
+set_driving_cell -lib_cell $::env(SYNTH_DRIVING_CELL) -pin $::env(SYNTH_DRIVING_CELL_PIN) [all_inputs]
+set cap_load [expr $::env(SYNTH_CAP_LOAD) / 1000.0]
+puts "\[INFO\]: Setting load to: $cap_load"
+set_load  $cap_load [all_outputs]
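
The arithmetic in `decred_top.sdc` above can be checked by hand with a small Python sketch. The function name is hypothetical, and the `IO_PCT` and `SYNTH_CAP_LOAD` values in the usage note are placeholders chosen for illustration, not values taken from this change. The unit names assume `CLOCK_PERIOD` is in nanoseconds and `SYNTH_CAP_LOAD` is in femtofarads, so dividing by 1000 yields picofarads.

```python
def sdc_constraints(clock_period_ns: float, io_pct: float,
                    synth_cap_load_ff: float) -> dict:
    """Mirror the expr calculations in the SDC file.

    input/output delay = CLOCK_PERIOD * IO_PCT
    cap_load           = SYNTH_CAP_LOAD / 1000.0
    """
    io_delay_ns = clock_period_ns * io_pct
    return {
        "input_delay_ns": io_delay_ns,
        "output_delay_ns": io_delay_ns,
        "cap_load_pf": synth_cap_load_ff / 1000.0,
    }
```

For example, `sdc_constraints(15.0, 0.2, 17.65)` gives input and output delays of about 3 ns for the 15 ns period set in `config.tcl`, under the assumed `IO_PCT` of 0.2.
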
diff --git a/openlane/user_project_wrapper/config.tcl b/openlane/user_project_wrapper/config.tcl
deleted file mode 120000
index d4a8f25..0000000
--- a/openlane/user_project_wrapper/config.tcl
+++ /dev/null
@@ -1 +0,0 @@
-../user_project_wrapper_empty/config.tcl
\ No newline at end of file
diff --git a/openlane/user_project_wrapper/config.tcl b/openlane/user_project_wrapper/config.tcl
new file mode 100644
index 0000000..04a12f7
--- /dev/null
+++ b/openlane/user_project_wrapper/config.tcl
@@ -0,0 +1,46 @@
+set script_dir [file dirname [file normalize [info script]]]
+
+set ::env(DESIGN_NAME) user_project_wrapper
+set ::env(FP_PIN_ORDER_CFG) $script_dir/pin_order.cfg
+
+set ::env(PDN_CFG) $script_dir/pdn.tcl
+set ::env(FP_PDN_CORE_RING) 1
+set ::env(FP_SIZING) absolute
+set ::env(DIE_AREA) "0 0 2920 3520"
+
+set ::unit 2.4
+set ::env(FP_IO_VEXTEND) [expr 2*$::unit]
+set ::env(FP_IO_HEXTEND) [expr 2*$::unit]
+set ::env(FP_IO_VLENGTH) $::unit
+set ::env(FP_IO_HLENGTH) $::unit
+
+set ::env(FP_IO_VTHICKNESS_MULT) 4
+set ::env(FP_IO_HTHICKNESS_MULT) 4
+
+
+set ::env(CLOCK_PORT) "user_clock2"
+#set ::env(CLOCK_NET) "mprj.clk"
+
+set ::env(CLOCK_PERIOD) "15"
+
+set ::env(PL_OPENPHYSYN_OPTIMIZATIONS) 0
+set ::env(DIODE_INSERTION_STRATEGY) 0
+
+# Need to fix a FastRoute bug for this to work, but it's good
+# for a sense of "isolation"
+set ::env(MAGIC_ZEROIZE_ORIGIN) 0
+set ::env(MAGIC_WRITE_FULL_LEF) 1
+
+set ::env(VERILOG_FILES) "\
+	$script_dir/../../verilog/rtl/defines.v \
+	$script_dir/../../verilog/rtl/user_project_wrapper.v"
+
+set ::env(VERILOG_FILES_BLACKBOX) "\
+	$script_dir/../../verilog/rtl/defines.v \
+	$script_dir/../../verilog/rtl/decred_top/rtl/src/*.v"
+
+set ::env(EXTRA_LEFS) "\
+	$script_dir/../../lef/decred_top.lef"
+
+set ::env(EXTRA_GDS_FILES) "\
+	$script_dir/../../gds/decred_top.gds"
diff --git a/openlane/user_project_wrapper/interactive.tcl b/openlane/user_project_wrapper/interactive.tcl
index 394f62b..81e865a 100644
--- a/openlane/user_project_wrapper/interactive.tcl
+++ b/openlane/user_project_wrapper/interactive.tcl
@@ -14,7 +14,8 @@
 
 apply_def_template
 
-add_macro_placement mprj 1150 1700 N
+#add_macro_placement mprj 1150 1700 N
+add_macro_placement mprj 0 0 N
 
 manual_macro_placement f
 
diff --git a/verilog/rtl/decred_top/rtl/src/addressalyzer.v b/verilog/rtl/decred_top/rtl/src/addressalyzer.v
new file mode 100755
index 0000000..e6c5f5f
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/addressalyzer.v
@@ -0,0 +1,219 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+`timescale 1ns / 1ps
+
+module addressalyzer (
+  input  wire  RST,
+  input  wire  SPI_CLK,
+
+  input        start_of_transfer,
+  input        end_of_transfer,
+  input [7:0]  data_in_value,
+  input        data_in_ready,
+  input        data_out_request,
+  input        write_enable_mask,
+
+  output wire [14:0] ram_address_out,
+  output reg   address_strobe,
+  output       ram_read_strobe,
+  output       ram_write_strobe
+  );
+
+  wire read_cycle;
+
+  // //////////////////////////////////////////////////////
+  // Address FSM signals 
+  parameter ADDR_SIZE     = 6;
+  parameter ADDR_IDLE     = 6'b000001;
+  parameter ADDR_ADDR1    = 6'b000010;
+  parameter ADDR_ADDR2    = 6'b000100;  
+  parameter ADDR_RD_BYTES = 6'b001000;  
+  parameter ADDR_WR_BYTEQ = 6'b010000;  
+  parameter ADDR_WR_BYTES = 6'b100000;  
+
+  reg [ADDR_SIZE - 1:0]  addr_state;
+
+  reg [15:0] address_local;
+  assign read_cycle = address_local[15];
+  assign ram_address_out = address_local[14:0];
+
+  always @ (posedge SPI_CLK) begin 
+    if(RST) begin 
+      address_local  <= 0;
+      address_strobe <= 0;
+      addr_state     <= ADDR_IDLE;
+    end
+    else begin
+
+      case (addr_state)
+
+        ADDR_IDLE:
+		    begin
+          address_local  <= 0;
+          address_strobe <= 0;
+	  	    if (start_of_transfer == 1'b1) begin
+            addr_state <= ADDR_ADDR1;
+          end
+		    end
+
+        ADDR_ADDR1:
+  		  if (data_in_ready == 1'b1) begin
+
+          address_local <= {data_in_value, data_in_value};
+          addr_state <= ADDR_ADDR2;
+
+        end
+
+        ADDR_ADDR2:
+		    if (data_in_ready == 1'b1) begin
+
+          address_local  <= {address_local[15:8], data_in_value};
+          address_strobe <= 1;
+
+          if (read_cycle == 1'b1) begin
+
+            addr_state <= ADDR_RD_BYTES;
+
+          end else begin
+
+            addr_state <= ADDR_WR_BYTEQ;
+ 
+          end
+        end 
+
+        ADDR_RD_BYTES:
+		    if (data_out_request == 1'b1) begin
+
+          address_local  <= address_local + 1'b1;
+          address_strobe <= 0;
+
+        end else if (end_of_transfer == 1'b1) begin
+
+          addr_state     <= ADDR_IDLE;
+          address_strobe <= 0;
+
+        end
+
+        ADDR_WR_BYTEQ:
+		    if (data_in_ready == 1'b1) begin
+
+          addr_state     <= ADDR_WR_BYTES;
+          address_strobe <= 0;
+
+        end else if (end_of_transfer == 1'b1) begin
+
+          addr_state <= ADDR_IDLE;
+          address_strobe <= 0;
+
+        end
+
+        ADDR_WR_BYTES:
+	      if (data_in_ready == 1'b1) begin
+
+          address_local <= address_local + 1'b1;
+
+        end else if (end_of_transfer == 1'b1) begin
+
+          addr_state <= ADDR_IDLE;
+
+        end
+
+        default: begin
+          addr_state <= ADDR_IDLE;
+        end
+      endcase
+    end
+  end
+
+  // //////////////////////////////////////////////////////
+  // Read/Write FSM signals 
+  parameter RDWR_SIZE       = 4;
+  parameter RDWR_IDLE       = 4'b0001;
+  parameter RDWR_CLK_EN     = 4'b0010;
+  parameter RDWR_STROBE0    = 4'b0100;  
+  parameter RDWR_END        = 4'b1000;  
+
+  reg [RDWR_SIZE - 1:0]  rdwr_state;
+  reg rdwr_read_en;
+  reg rdwr_write_en;
+
+  assign ram_read_strobe = rdwr_read_en;
+  assign ram_write_strobe = rdwr_write_en;
+
+  always @ (posedge SPI_CLK) begin 
+    if(RST) begin 
+      rdwr_state    <= RDWR_IDLE;
+      rdwr_read_en  <= 0;
+      rdwr_write_en <= 0;
+    end
+    else begin
+
+    case (rdwr_state)
+
+      RDWR_IDLE:
+      if (addr_state == ADDR_WR_BYTES) begin
+
+        rdwr_read_en     <= 0;
+        rdwr_write_en    <= write_enable_mask;
+        rdwr_state       <= RDWR_CLK_EN;
+
+      end else if (addr_state == ADDR_RD_BYTES) begin
+
+        rdwr_read_en     <= 0;
+        rdwr_write_en    <= 0;
+        rdwr_state       <= RDWR_CLK_EN;
+		  end
+
+      RDWR_CLK_EN:
+      if (addr_state == ADDR_WR_BYTES) begin
+
+        rdwr_read_en     <= 0;
+        rdwr_write_en    <= 0;
+        rdwr_state       <= RDWR_STROBE0;
+
+      end else if (addr_state == ADDR_RD_BYTES) begin
+
+        rdwr_read_en     <= 1;
+        rdwr_write_en    <= 0;
+        rdwr_state       <= RDWR_STROBE0;
+		  end
+
+      RDWR_STROBE0:
+      begin
+
+        rdwr_read_en     <= 0;
+        rdwr_write_en    <= 0;
+        rdwr_state       <= RDWR_END;
+      end
+
+      RDWR_END:
+      begin
+
+        if (data_in_ready == 1'b1) begin
+
+          rdwr_state <= RDWR_IDLE;
+		    end
+      end
+
+      default:
+      begin
+        rdwr_state <= RDWR_IDLE;
+      end
+      endcase
+     end
+   end
+endmodule // addressalyzer
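
The address phase of `addressalyzer` above can be summarized with a short Python sketch. It assumes only what the FSM shows: the first byte after `start_of_transfer` lands in `address_local[15:8]`, the second in `address_local[7:0]`, bit 15 selects a read cycle, and the low 15 bits drive `ram_address_out`. The helper names are hypothetical, and the bit order within each SPI byte is decided by the deserializer, which is not modeled here.

```python
READ_FLAG = 0x8000  # address_local[15] -> read_cycle
ADDR_MASK = 0x7FFF  # address_local[14:0] -> ram_address_out


def encode_header(address: int, read: bool) -> bytes:
    """Return the two address bytes, high byte first (ADDR1 then ADDR2)."""
    if not 0 <= address <= ADDR_MASK:
        raise ValueError("address must fit in 15 bits")
    word = address | (READ_FLAG if read else 0)
    return word.to_bytes(2, "big")


def decode_header(header: bytes) -> tuple:
    """Split two received address bytes into (address, is_read)."""
    if len(header) != 2:
        raise ValueError("header must be exactly 2 bytes")
    word = int.from_bytes(header, "big")
    return word & ADDR_MASK, bool(word & READ_FLAG)
```

After the header, the FSM increments the address on each further data byte, so a burst appears to walk consecutive addresses starting from the one encoded here.
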
diff --git a/verilog/rtl/decred_top/rtl/src/clock_div.v b/verilog/rtl/decred_top/rtl/src/clock_div.v
new file mode 100755
index 0000000..d4f197f
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/clock_div.v
@@ -0,0 +1,215 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+`timescale 1ns / 1ps
+
+/* Integer-N clock divider */
+`default_nettype none
+ 
+module clock_div #(
+    parameter SIZE = 3		// Number of bits for the divider value
+) (
+    in, out, N, resetb
+);
+    input in;			// input clock
+    input [SIZE-1:0] N;		// the number to be divided by
+    input resetb;		// synchronous reset (sense negative)
+    output out;			// divided output clock
+ 
+    wire out_odd;		// output of odd divider
+    wire out_even;		// output of even divider
+    wire not_zero;		// signal to find divide by 0 case
+    wire enable_even;		// enable of even divider
+    wire enable_odd;		// enable of odd divider
+
+    reg [SIZE-1:0] syncN;	// N synchronized to output clock
+    reg [SIZE-1:0] syncNp;	// N synchronized to output clock
+ 
+    assign not_zero = | syncN[SIZE-1:1];
+ 
+    assign out = (out_odd & syncN[0] & not_zero) | (out_even & !syncN[0]);
+    assign enable_odd = syncN[0] & not_zero;
+    assign enable_even = !syncN[0];
+
+    // Divider value synchronization (double-synchronized to avoid metastability)
+    always @(posedge out) begin
+	if (resetb == 1'b0) begin
+	    syncN <= 'd2;	// Default to divide-by-2 on system reset
+	    syncNp <= 'd2;	// Default to divide-by-2 on system reset
+	end else begin
+	    syncNp <= N;
+	    syncN <= syncNp;
+	end
+    end
+ 
+    // Even divider
+    even even_0(in, out_even, syncN, resetb, not_zero, enable_even);
+    // Odd divider
+    odd odd_0(in, out_odd, syncN, resetb, enable_odd);
+ 
+endmodule // clock_div
+ 
+/* Odd divider */
+
+module odd #(
+    parameter SIZE = 3
+) (
+    clk, out, N, resetb, enable
+);
+    input clk;			// fast input clock
+    output out;			// slower divided output clock
+    input [SIZE-1:0] N;		// division factor
+    input resetb;		// synchronous reset
+    input enable;		// odd enable
+ 
+    reg [SIZE-1:0] counter;	// these 2 counters are used
+    reg [SIZE-1:0] counter2;	// to generate non-overlapping signals
+    reg out_counter;		// positive edge triggered counter
+    reg out_counter2;		// negative edge triggered counter
+    reg rst_pulse;		// pulse generated when vector N changes
+    reg [SIZE-1:0] old_N;	// gets set to old N when N is changed
+    wire not_zero;		// if !not_zero, we divide by 1
+ 
+    // xor to generate 50% duty, half-period waves of final output
+    assign out = out_counter2 ^ out_counter;
+
+    // positive edge counter/divider
+    always @(posedge clk) begin
+	if (resetb == 1'b0) begin
+	    counter <= N;
+	    out_counter <= 1;
+	end else if (rst_pulse) begin
+	    counter <= N;
+	    out_counter <= 1;
+	end else if (enable) begin
+	    if (counter == 1) begin
+		counter <= N;
+		out_counter <= ~out_counter;
+	    end else begin
+		counter <= counter - 1'b1;
+	    end
+	end
+    end
+ 
+    reg [SIZE-1:0] initial_begin;	// this is used to offset the negative edge counter
+    wire [SIZE:0] interm_3;		// from the positive edge counter in order to
+    assign interm_3 = {1'b0,N} + 2'b11;	// guarantee 50% duty cycle.
+ 
+    // Counter driven by negative edge of clock.
+
+    always @(negedge clk) begin
+	if (resetb == 1'b0) begin
+	    // reset the counter at system reset
+	    counter2 <= N;
+	    initial_begin <= interm_3[SIZE:1];
+	    out_counter2 <= 1;
+	end else if (rst_pulse) begin
+	    // reset the counter at change of N.
+	    counter2 <= N;
+	    initial_begin <= interm_3[SIZE:1];
+	    out_counter2 <= 1;
+	end else if ((initial_begin <= 1) && enable) begin
+
+	    // Do normal logic after odd calibration.
+	    // This is the same as the even counter.
+	    if (counter2 == 1) begin
+		counter2 <= N;
+		out_counter2 <= ~out_counter2;
+	    end else begin
+		counter2 <= counter2 - 1'b1;
+	    end
+	end else if (enable) begin
+	    initial_begin <= initial_begin - 1'b1;
+	end
+    end
+ 
+    //
+    // reset pulse generator:
+    //               __    __    __    __    _
+    // clk:       __/  \__/  \__/  \__/  \__/
+    //            _ __________________________
+    // N:         _X__________________________
+    //               _____
+    // rst_pulse: __/     \___________________
+    //
+    // This block generates an internal reset for the odd divider in the
+    // form of a single pulse signal when the odd divider is enabled.
+
+    always @(posedge clk) begin
+	if (resetb == 1'b0) begin
+	    rst_pulse <= 0;
+	end else if (enable) begin
+	    if (N != old_N) begin
+		// pulse when N changes
+		rst_pulse <= 1;
+	    end else begin
+		rst_pulse <= 0;
+	    end
+	end
+    end
+ 
+    always @(posedge clk) begin
+	// always save the old N value to guarantee reset from
+	// an even-to-odd transition.
+	old_N <= N;
+    end	
+ 
+endmodule // odd
+
+/* Even divider */
+
+module even #(
+    parameter SIZE = 3
+) (
+    clk, out, N, resetb, not_zero, enable
+);
+    input clk;		// fast input clock
+    output out;		// slower divided clock
+    input [SIZE-1:0] N;	// divide by factor 'N'
+    input resetb;	// synchronous reset (active low)
+    input not_zero;	// if !not_zero divide by 1
+    input enable;	// enable the even divider
+ 
+    reg [SIZE-1:0] counter;
+    reg out_counter;
+    wire [SIZE-1:0] div_2;
+ 
+    // if N=0 just output the clock, otherwise, divide it.
+    assign out = (clk & !not_zero) | (out_counter & not_zero);
+    assign div_2 = {1'b0, N[SIZE-1:1]};
+ 
+    // simple flip-flop even divider
+    always @(posedge clk) begin
+	if (resetb == 1'b0) begin
+	    counter <= 1;
+	    out_counter <= 1;
+
+	end else if (enable) begin
+	    // only use switching power if enabled
+	    if (counter == 1) begin
+		// divide after counter has reached bottom
+		// of interval 'N' which will be value '1'
+		counter <= div_2;
+		out_counter <= ~out_counter;
+	    end else begin
+		// decrement the counter and wait
+		counter <= counter-1;	// to start next transition.
+	    end
+	end
+    end
+ 
+endmodule //even
+`default_nettype wire
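
Reading only the `assign` statements in `clock_div` and `even` above, the path that drives `out` depends on bit 0 of the synchronized divider value and on `not_zero` (the OR of the remaining bits). The Python sketch below classifies each value of N accordingly. The function name and the path labels are hypothetical, and the sketch does not model the counters or the resulting waveforms.

```python
def divider_path(n: int, size: int = 3) -> str:
    """Classify which term of the `out` expression can be active for N = n.

    out = (out_odd & syncN[0] & not_zero) | (out_even & !syncN[0])
    even.out = (clk & !not_zero) | (out_counter & not_zero)
    """
    if not 0 <= n < (1 << size):
        raise ValueError(f"N must fit in {size} bits")
    bit0 = bool(n & 1)
    not_zero = (n >> 1) != 0  # OR-reduction of syncN[SIZE-1:1]
    if bit0 and not_zero:
        return "odd"       # odd divider output is selected
    if not bit0 and not_zero:
        return "even"      # even divider counter output is selected
    if not bit0:
        return "bypass"    # N = 0: even path passes clk straight through
    return "held_low"      # N = 1: neither term of `out` can be high
```

Under this reading, N = 1 appears to hold the output low rather than dividing by 1, which may be worth confirming in simulation before relying on that setting.
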
diff --git a/verilog/rtl/decred_top/rtl/src/decred.v b/verilog/rtl/decred_top/rtl/src/decred.v
new file mode 100755
index 0000000..c923144
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/decred.v
@@ -0,0 +1,172 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+`timescale 1ns / 1ps
+`include "decred_defines.v"
+
+module decred (
+  input  wire  EXT_RESET_N_fromHost,
+  input  wire  SCLK_fromHost,
+  input  wire  M1_CLK,
+  input  wire  SPI_CLK,
+  input  wire  SCSN_fromHost,
+  input  wire  MOSI_fromHost,
+  input  wire  MISO_fromClient,
+  input  wire  IRQ_OUT_fromClient,
+  input  wire  ID_fromClient,
+
+  output wire  SCSN_toClient,
+  output wire  SCLK_toClient,
+  output wire  MOSI_toClient,
+  output wire  EXT_RESET_N_toClient,
+  output wire  ID_toHost,
+
+  output wire  CLK_LED,
+  output wire  MISO_toHost,
+  output wire  HASH_LED,
+  output wire  IRQ_OUT_toHost,
+  output wire  hash_clock_reset
+  );
+
+  // //////////////////////////////////////////////////////
+  // Pass-through wires
+  wire rst_local;
+  wire sclk_local;
+  wire scsn_local;
+  wire mosi_local;
+  wire miso_local;
+  wire irq_local;
+  wire address_stobe;
+  wire write_enable;
+  wire [6:0] setSPIAddr;
+
+  // //////////////////////////////////////////////////////
+  // Heartbeat output
+
+  reg [23:1] counter;
+  
+  always @(posedge M1_CLK)
+    if (rst_local)
+      counter <= 0;
+    else
+      counter <= counter + 1'b1;
+
+  assign CLK_LED = counter[23];
+
+  // //////////////////////////////////////////////////////
+  // SPI deserializer
+
+  wire       start_of_transfer;
+  wire       end_of_transfer;
+  wire [7:0] mosi_data_out;
+  wire       mosi_data_ready;
+  wire       miso_data_request;
+  wire [7:0] miso_data_in;
+
+  spi spiBlock(
+    .SPI_CLK(SPI_CLK),
+    .RST(rst_local),
+    .SCLK(sclk_local),
+    .SCSN(scsn_local),
+    .MOSI(mosi_local),
+
+    .start_of_transfer(start_of_transfer),
+    .end_of_transfer(end_of_transfer),
+    .mosi_data_out(mosi_data_out),
+    .mosi_data_ready(mosi_data_ready),
+    .MISO(miso_local),
+    .miso_data_request(miso_data_request),
+    .miso_data_in(miso_data_in)
+  );
+
+  // //////////////////////////////////////////////////////
+  // SPI pass through
+
+  spi_passthrough spiPassBlock(
+    .SPI_CLK(SPI_CLK),
+    .RSTin(EXT_RESET_N_fromHost),
+    .ID_in(ID_fromClient),
+    .IRQ_in(IRQ_OUT_fromClient),
+    .address_strobe(address_stobe),
+    .currentSPIAddr(address[14:8]),
+    .setSPIAddr(setSPIAddr),
+
+    .SCLKin(SCLK_fromHost),
+    .SCSNin(SCSN_fromHost),
+    .MOSIin(MOSI_fromHost),
+    .MISOout(MISO_toHost),
+
+    .rst_local(rst_local),
+    .sclk_local(sclk_local),
+    .scsn_local(scsn_local),
+    .mosi_local(mosi_local),
+    .miso_local(miso_local),
+    .irq_local(irq_local),
+    .write_enable(write_enable),
+
+    .RSTout(EXT_RESET_N_toClient),
+    .SCLKout(SCLK_toClient),
+    .SCSNout(SCSN_toClient),
+    .MOSIout(MOSI_toClient),
+    .MISOin(MISO_fromClient),
+    .IRQout(IRQ_OUT_toHost)
+  );
+
+  // //////////////////////////////////////////////////////
+  // Interface to addressalyzer
+
+  wire [14:0] address;
+  wire        regFile_read_strobe;
+  wire        regFile_write_strobe;
+
+  addressalyzer addressalyzerBlock (
+    .SPI_CLK(SPI_CLK),
+    .RST(rst_local),
+
+    .start_of_transfer(start_of_transfer),
+    .end_of_transfer(end_of_transfer),
+    .data_in_value(mosi_data_out),
+    .data_in_ready(mosi_data_ready),
+    .data_out_request(miso_data_request),
+    .write_enable_mask(write_enable),
+
+    .ram_address_out(address),
+    .address_strobe(address_stobe),
+    .ram_read_strobe(regFile_read_strobe),
+    .ram_write_strobe(regFile_write_strobe)
+  );
+
+  // //////////////////////////////////////////////////////
+  // Interface to regfile
+
+  regBank #(.NUM_OF_MACROS(`NUMBER_OF_MACROS))
+  regBankBlock (
+    .SPI_CLK(SPI_CLK),
+    .RST(rst_local),
+    .M1_CLK(M1_CLK),
+    .address(address[7:0]),
+    .data_in(mosi_data_out),
+    .read_strobe(regFile_read_strobe),
+    .write_strobe(regFile_write_strobe),
+    .hash_clock_reset(hash_clock_reset),
+    .data_out(miso_data_in),
+    .LED_out(HASH_LED),
+    .spi_addr(setSPIAddr),
+    .ID_out(ID_toHost),
+    .interrupt_out(irq_local)
+  );
+
+endmodule // decred
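
The heartbeat block in decred.v drives `CLK_LED` from the top bit of a 23-bit free-running counter, so the LED completes one on/off cycle every 2^23 `M1_CLK` cycles. The sketch below shows the same idea with the width as a parameter, so it can be simulated in a few cycles. `heartbeat_sketch`, `heartbeat_sketch_tb`, `WIDTH`, and the active-high synchronous reset are assumptions made for illustration only.

```verilog
`timescale 1ns / 1ps

// Hypothetical stand-alone version of the heartbeat counter.
module heartbeat_sketch #(
    parameter WIDTH = 23            // decred.v uses a 23-bit counter
) (
    input  wire clk,
    input  wire rst,                // assumed active-high, synchronous
    output wire led
);
    reg [WIDTH-1:0] counter;

    always @(posedge clk)
        if (rst)
            counter <= {WIDTH{1'b0}};
        else
            counter <= counter + 1'b1;

    // The MSB toggles every 2^(WIDTH-1) clocks: period 2^WIDTH clocks.
    assign led = counter[WIDTH-1];
endmodule

// Hypothetical testbench: WIDTH=4 gives an LED period of 16 clocks.
module heartbeat_sketch_tb;
    reg  clk = 1'b0;
    reg  rst = 1'b1;
    wire led;

    heartbeat_sketch #(.WIDTH(4)) dut (.clk(clk), .rst(rst), .led(led));

    always #5 clk = ~clk;

    initial begin
        $monitor("t=%0d led=%b", $time, led);
        @(negedge clk) rst = 1'b0;  // first posedge has already reset the counter
        repeat (40) @(posedge clk);
        $finish;
    end
endmodule
```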
diff --git a/verilog/rtl/decred_top/rtl/src/decred_defines.v b/verilog/rtl/decred_top/rtl/src/decred_defines.v
new file mode 100755
index 0000000..dd6e4f7
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/decred_defines.v
@@ -0,0 +1,23 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+// Settings that are enabled by default are marked with D
+`define NUMBER_OF_MACROS 2 // -- value required
+`define USE_REG_WRITE_TO_HASHMACRO // D-- register write ops to hash macros
+`define USE_VARIABLE_NONCE_OFFSET // D--
+//`define USE_SYSTEM_VERILOG //  -- 
+`define USE_NONBLOCKING_HASH_MACRO // D-- comment-out for blocking
+//`define FULL_CHIP_SIM //  --
diff --git a/verilog/rtl/decred_top/rtl/src/decred_top.v b/verilog/rtl/decred_top/rtl/src/decred_top.v
new file mode 100755
index 0000000..0ff84b0
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/decred_top.v
@@ -0,0 +1,103 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+`timescale 1ns / 1ps
+
+`include "decred_defines.v"
+
+module decred_top (
+`ifdef USE_POWER_PINS
+    inout vdda1,	// User area 1 3.3V supply
+    inout vdda2,	// User area 2 3.3V supply
+    inout vssa1,	// User area 1 analog ground
+    inout vssa2,	// User area 2 analog ground
+    inout vccd1,	// User area 1 1.8V supply
+    inout vccd2,	// User area 2 1.8V supply
+    inout vssd1,	// User area 1 digital ground
+    inout vssd2,	// User area 2 digital ground
+`endif
+  input  wire  EXT_RESET_N_fromHost,
+  input  wire  SCLK_fromHost,
+  input  wire  M1_CLK_IN,
+  input  wire  M1_CLK_SELECT,
+  input  wire  PLL_INPUT,
+  input  wire  S1_CLK_IN,
+  input  wire  S1_CLK_SELECT,
+  input  wire  SCSN_fromHost,
+  input  wire  MOSI_fromHost,
+  input  wire  MISO_fromClient,
+  input  wire  IRQ_OUT_fromClient,
+  input  wire  ID_fromClient,
+
+  input  wire  SPI_CLK_RESET_N,
+
+  output wire  SCSN_toClient,
+  output wire  SCLK_toClient,
+  output wire  MOSI_toClient,
+  output wire  EXT_RESET_N_toClient,
+  output wire  ID_toHost,
+
+  output wire  CLK_LED,
+  output wire  MISO_toHost,
+  output wire  HASH_LED,
+  output wire  IRQ_OUT_toHost
+  );
+
+  // //////////////////////////////////////////////////////
+  // Clocking
+
+  // M1 clock is sourced from pin or PLL
+  wire m1_clk_internal;
+  assign m1_clk_internal = (M1_CLK_SELECT) ? M1_CLK_IN : PLL_INPUT;
+
+  // S1 clock is sourced from pin or divider
+  wire s1_clk_internal;
+  wire s1_div_output;
+
+  clock_div #(.SIZE(3)) clock_divBlock (
+    .in(m1_clk_internal),
+    .out(s1_div_output),
+    .N(3'h6),
+    .resetb(SPI_CLK_RESET_N)
+  );
+
+  assign s1_clk_internal = (S1_CLK_SELECT) ? S1_CLK_IN : s1_div_output;
+
+  decred decred_macro (
+    .EXT_RESET_N_fromHost(EXT_RESET_N_fromHost),
+    .SCLK_fromHost(SCLK_fromHost),
+    .M1_CLK(m1_clk_internal),
+    .SPI_CLK(s1_clk_internal),
+    .SCSN_fromHost(SCSN_fromHost),
+    .MOSI_fromHost(MOSI_fromHost),
+    .MISO_fromClient(MISO_fromClient),
+    .IRQ_OUT_fromClient(IRQ_OUT_fromClient),
+    .ID_fromClient(ID_fromClient),
+
+    .SCSN_toClient(SCSN_toClient),
+    .SCLK_toClient(SCLK_toClient),
+    .MOSI_toClient(MOSI_toClient),
+    .EXT_RESET_N_toClient(EXT_RESET_N_toClient),
+    .ID_toHost(ID_toHost),
+
+    .CLK_LED(CLK_LED),
+    .MISO_toHost(MISO_toHost),
+    .HASH_LED(HASH_LED),
+    .IRQ_OUT_toHost(IRQ_OUT_toHost),
+    .hash_clock_reset()
+  );
+
+endmodule // decred_top
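
The `blake256_qr` module in hash_macro_nonblock.v (next file in this patch) spreads the BLAKE-256 G function over four register stages. A single-cycle reference model makes that pipelining easier to check. The sketch below is one way to do it, assuming it is compiled together with hash_macro_nonblock.v. The names `blake256_g_ref_tb`, `rotr`, and `g_ref` are hypothetical, and the operands are arbitrary values, not data from a real Decred header.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench, not part of the patch.
module blake256_g_ref_tb;

    // 32-bit rotate right by n (n in 1..31)
    function [31:0] rotr;
        input [31:0] x;
        input [4:0]  n;
        begin
            rotr = (x >> n) | (x << (6'd32 - n));
        end
    endfunction

    // Unpipelined BLAKE-256 G function, returns {a', b', c', d'}
    function [127:0] g_ref;
        input [31:0] a, b, c, d, m0, m1, cs0, cs1;
        reg   [31:0] ta, tb, tc, td;
        begin
            ta = a + b + (m0 ^ cs1);
            td = rotr(d ^ ta, 5'd16);
            tc = c + td;
            tb = rotr(b ^ tc, 5'd12);
            ta = ta + tb + (m1 ^ cs0);
            td = rotr(td ^ ta, 5'd8);
            tc = tc + td;
            tb = rotr(tb ^ tc, 5'd7);
            g_ref = {ta, tb, tc, td};
        end
    endfunction

    reg          clk = 1'b0;
    reg  [31:0]  a, b, c, d, m0, m1, cs0, cs1;
    wire [31:0]  a_p, b_p, c_p, d_p;
    reg  [127:0] expected;

    blake256_qr dut (
        .clk(clk), .m0(m0), .m1(m1), .cs0(cs0), .cs1(cs1),
        .a(a), .b(b), .c(c), .d(d),
        .a_prim(a_p), .b_prim(b_p), .c_prim(c_p), .d_prim(d_p)
    );

    always #5 clk = ~clk;

    initial begin
        // Arbitrary operands, held constant for the whole run
        a   = 32'h01234567;  b   = 32'h89ABCDEF;
        c   = 32'h0F1E2D3C;  d   = 32'h4B5A6978;
        m0  = 32'hDEADBEEF;  m1  = 32'h00C0FFEE;
        cs0 = 32'h243F6A88;  cs1 = 32'h85A308D3;
        expected = g_ref(a, b, c, d, m0, m1, cs0, cs1);

        repeat (4) @(posedge clk);  // four register stages in blake256_qr
        #1;                         // let the nonblocking updates settle
        if ({a_p, b_p, c_p, d_p} === expected)
            $display("PASS");
        else
            $display("FAIL: got %h, expected %h",
                     {a_p, b_p, c_p, d_p}, expected);
        $finish;
    end
endmodule
```

The inputs are held constant, so the result can be sampled after the fourth rising edge. A streaming test would instead delay each expected value by four cycles.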
diff --git a/verilog/rtl/decred_top/rtl/src/hash_macro_nonblock.v b/verilog/rtl/decred_top/rtl/src/hash_macro_nonblock.v
new file mode 100755
index 0000000..204cecf
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/hash_macro_nonblock.v
@@ -0,0 +1,1376 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+`include "decred_defines.v"
+
+`ifdef USE_SYSTEM_VERILOG
+`define CS_INDEX(a) \
+	480-(a*32)+:32
+
+`define SIGMA_INDEX(a,b) \
+	892-(a*64)-(b*4)+:4
+`endif
+
+`define NUM_OF_THREADS 6
+
+module blake256_qr(
+		input wire clk,
+		input wire [31 : 0]  m0,
+		input wire [31 : 0]  m1,
+		input wire [31 : 0]  cs0,
+		input wire [31 : 0]  cs1,
+
+		input wire [31 : 0]  a,
+		input wire [31 : 0]  b,
+		input wire [31 : 0]  c,
+		input wire [31 : 0]  d,
+
+		output wire [31 : 0] a_prim,
+		output wire [31 : 0] b_prim,
+		output wire [31 : 0] c_prim,
+		output wire [31 : 0] d_prim
+);
+
+  //----------------------------------------------------------------
+  // QR module regs/wires
+  //----------------------------------------------------------------
+  reg [31 : 0] internal_a_prim;
+  reg [31 : 0] internal_b_prim;
+  reg [31 : 0] internal_c_prim;
+  reg [31 : 0] internal_d_prim;
+
+  assign a_prim = internal_a_prim;
+  assign b_prim = internal_b_prim;
+  assign c_prim = internal_c_prim;
+  assign d_prim = internal_d_prim;
+
+  reg [31 : 0] a_reg[2:0];
+  reg [31 : 0] b_reg[2:0];
+  reg [31 : 0] c_reg[2:0];
+  reg [31 : 0] d_reg[2:0];
+  
+  reg [31 : 0] a_reg2[2:0];
+  reg [31 : 0] c_reg2[2:0];
+  reg [31 : 0] d_reg2[2:0];
+
+  reg [31 : 0] m1_cs0_reg[1:0];
+
+  //----------------------------------------------------------------
+  // qr
+  //
+  // The actual quarterround engine
+  //----------------------------------------------------------------
+  always @(posedge clk)
+    begin : qr
+	   
+		// a <- a + b + (m[sigma_r(2i)] ^ c[sigma_r(2i+1)])
+		// d <- rotr(d ^ a, 16)
+		// c <- c + d
+		// b <- rotr(b ^ c, 12)
+		// a <- a + b + (m[sigma_r(2i+1)] ^ c[sigma_r(2i)])
+		// d <- rotr(d ^ a, 8)
+		// c <- c + d
+		// b <- rotr(b ^ c, 7)
+		
+		a_reg[0] <= (a + b + (m0 ^ cs1));
+		b_reg[0] <= b;
+		c_reg[0] <= c;
+		d_reg[0] <= d;
+		// register duplication
+		a_reg2[0] <= (a + b + (m0 ^ cs1));
+		c_reg2[0] <= c;
+		d_reg2[0] <= d;
+		m1_cs0_reg[0] <= (m1 ^ cs0);
+		
+		a_reg[1] <= a_reg[0];
+		b_reg[1] <= (b_reg[0] ^ (c_reg2[0] + (((d_reg2[0] ^ a_reg2[0]) << 16) | ((d_reg2[0] ^ a_reg2[0])  >> 16))));
+		c_reg[1] <= (c_reg[0] + (((d_reg[0] ^ a_reg[0]) << 16) | ((d_reg[0] ^ a_reg[0])  >> 16)));
+		d_reg[1] <= (((d_reg[0] ^ a_reg[0]) << 16) | ((d_reg[0] ^ a_reg[0])  >> 16));
+		m1_cs0_reg[1] <= m1_cs0_reg[0];
+
+		a_reg[2] <= (a_reg[1] + ((b_reg[1] << 20) | (b_reg[1] >> 12)) + m1_cs0_reg[1]);
+		b_reg[2] <= ((b_reg[1] << 20) | (b_reg[1] >> 12));
+		c_reg[2] <= c_reg[1];
+		d_reg[2] <= (d_reg[1] ^ (a_reg[1] + ((b_reg[1] << 20) | (b_reg[1] >> 12)) + m1_cs0_reg[1]));
+
+      internal_a_prim <= a_reg[2];
+      internal_b_prim <= (((b_reg[2] ^ (c_reg[2] + ((d_reg[2] << 24) | (d_reg[2] >> 8)))) << 25) | ((b_reg[2] ^ (c_reg[2] + ((d_reg[2] << 24) | (d_reg[2] >> 8)))) >> 7));
+      internal_c_prim <= (c_reg[2] + ((d_reg[2] << 24) | (d_reg[2] >> 8)));
+      internal_d_prim <= ((d_reg[2] << 24) | (d_reg[2] >> 8));
+		
+		/*a_reg[0] <= (a + b + (m0 ^ cs1));
+		b_reg[0] <= (b ^ (c + (((d ^ (a + b + (m0 ^ cs1))) << 16) | ((d ^ (a + b + (m0 ^ cs1)))  >> 16))));
+		c_reg[0] <= (c + (((d ^ (a + b + (m0 ^ cs1))) << 16) | ((d ^ (a + b + (m0 ^ cs1)))  >> 16)));
+		d_reg[0] <= (((d ^ (a + b + (m0 ^ cs1))) << 16) | ((d ^ (a + b + (m0 ^ cs1)))  >> 16));
+		m1_cs0_reg <= (m1 ^ cs0);
+
+		a_reg[1] <= (a_reg[0] + ((b_reg[0] << 20) | (b_reg[0] >> 12)) + m1_cs0_reg);
+		b_reg[1] <= ((b_reg[0] << 20) | (b_reg[0] >> 12));
+		c_reg[1] <= c_reg[0];
+		d_reg[1] <= (d_reg[0] ^ (a_reg[0] + ((b_reg[0] << 20) | (b_reg[0] >> 12)) + m1_cs0_reg));
+
+      internal_a_prim <= a_reg[1];
+      internal_b_prim <= (((b_reg[1] ^ (c_reg[1] + ((d_reg[1] << 24) | (d_reg[1] >> 8)))) << 25) | ((b_reg[1] ^ (c_reg[1] + ((d_reg[1] << 24) | (d_reg[1] >> 8)))) >> 7));
+      internal_c_prim <= (c_reg[1] + ((d_reg[1] << 24) | (d_reg[1] >> 8)));
+      internal_d_prim <= ((d_reg[1] << 24) | (d_reg[1] >> 8));
+		*/
+
+    end // qr
+endmodule // blake256_qr
+
+`ifndef USE_SYSTEM_VERILOG
+module Sigma_CS#(
+  parameter QR_OFFSET=0
+)(
+		input wire [3  : 0]  round,
+		input wire           qr_base,
+
+		output wire [3  : 0] sigma0_out,
+		output wire [3  : 0] sigma1_out
+);
+
+  //----------------------------------------------------------------
+  // Sigma_CS module regs/wires
+  //----------------------------------------------------------------
+  reg [3  : 0] internal_sigma0_out;
+  reg [3  : 0] internal_sigma1_out;
+
+  assign sigma0_out = internal_sigma0_out;
+  assign sigma1_out = internal_sigma1_out;
+
+  //----------------------------------------------------------------
+  // Sigma_CS lookup
+  //----------------------------------------------------------------
+  always @*
+    begin : s_cs
+      case (QR_OFFSET)
+		  /* Offset 0 */
+		  0: begin
+  	       case ({round,qr_base})
+		      5'b00000,
+		      5'b10100: begin 
+		        internal_sigma0_out = 4'd0;
+			     internal_sigma1_out = 4'd1;
+		      end
+			   5'b00001,
+		      5'b10101: begin 
+		        internal_sigma0_out = 4'd8;
+				  internal_sigma1_out = 4'd9;
+		      end
+				
+				5'b00010,
+				5'b10110: begin 
+				  internal_sigma0_out = 4'd14;
+				  internal_sigma1_out = 4'd10;
+			   end
+				5'b00011,
+				5'b10111: begin 
+				  internal_sigma0_out = 4'd1;
+				  internal_sigma1_out = 4'd12;
+			   end
+				
+				5'b00100,
+				5'b11000: begin 
+				  internal_sigma0_out = 4'd11;
+				  internal_sigma1_out = 4'd8;
+				end
+				5'b00101,
+				5'b11001: begin 
+				  internal_sigma0_out = 4'd10;
+				  internal_sigma1_out = 4'd14;
+				end
+				
+				5'b00110,
+				5'b11010: begin 
+				  internal_sigma0_out = 4'd7;
+				  internal_sigma1_out = 4'd9;
+				end
+				5'b00111,
+				5'b11011: begin 
+				  internal_sigma0_out = 4'd2;
+				  internal_sigma1_out = 4'd6;
+				end
+				
+				5'b01000: begin 
+				  internal_sigma0_out = 4'd9;
+				  internal_sigma1_out = 4'd0;
+				end
+				5'b01001: begin 
+				  internal_sigma0_out = 4'd14;
+				  internal_sigma1_out = 4'd1;
+				end
+				
+				5'b01010: begin 
+				  internal_sigma0_out = 4'd2;
+				  internal_sigma1_out = 4'd12;
+				end
+				5'b01011: begin 
+				  internal_sigma0_out = 4'd4;
+				  internal_sigma1_out = 4'd13;
+				end
+				
+				5'b01100: begin 
+				  internal_sigma0_out = 4'd12;
+				  internal_sigma1_out = 4'd5;
+				end
+				5'b01101: begin 
+				  internal_sigma0_out = 4'd0;
+				  internal_sigma1_out = 4'd7;
+				end
+				
+				5'b01110: begin 
+				  internal_sigma0_out = 4'd13;
+				  internal_sigma1_out = 4'd11;
+				end
+				5'b01111: begin 
+				  internal_sigma0_out = 4'd5;
+				  internal_sigma1_out = 4'd0;
+				end
+				
+				5'b10000: begin 
+				  internal_sigma0_out = 4'd6;
+				  internal_sigma1_out = 4'd15;
+				end
+				5'b10001: begin 
+				  internal_sigma0_out = 4'd12;
+				  internal_sigma1_out = 4'd2;
+				end
+				
+				5'b10010: begin 
+				  internal_sigma0_out = 4'd10;
+				  internal_sigma1_out = 4'd2;
+				end
+				5'b10011: begin 
+				  internal_sigma0_out = 4'd15;
+				  internal_sigma1_out = 4'd11;
+				end
+				default: begin
+				  internal_sigma0_out = 4'd0;
+				  internal_sigma1_out = 4'd0;
+			    end
+			 endcase // ({round,qr_base}) /* Offset 0 */
+		  end // 0
+
+		  /* Offset 1 */
+		  1: begin
+		    case ({round,qr_base})
+			   5'b00000,
+				5'b10100: begin 
+				  internal_sigma0_out = 4'd2;
+				  internal_sigma1_out = 4'd3;
+				end
+				5'b00001,
+				5'b10101: begin 
+				  internal_sigma0_out = 4'd10;
+				  internal_sigma1_out = 4'd11;
+				end
+				
+				5'b00010,
+				5'b10110: begin 
+				  internal_sigma0_out = 4'd4;
+				  internal_sigma1_out = 4'd8;
+				end
+				5'b00011,
+				5'b10111: begin 
+				  internal_sigma0_out = 4'd0;
+				  internal_sigma1_out = 4'd2;
+				end
+				
+				5'b00100,
+				5'b11000: begin 
+				  internal_sigma0_out = 4'd12;
+				  internal_sigma1_out = 4'd0;
+				end
+				5'b00101,
+				5'b11001: begin 
+				  internal_sigma0_out = 4'd3;
+				  internal_sigma1_out = 4'd6;
+				end
+				
+				5'b00110,
+				5'b11010: begin 
+				  internal_sigma0_out = 4'd3;
+				  internal_sigma1_out = 4'd1;
+				end
+				5'b00111,
+				5'b11011: begin 
+				  internal_sigma0_out = 4'd5;
+				  internal_sigma1_out = 4'd10;
+				end
+				
+				5'b01000: begin 
+				  internal_sigma0_out = 4'd5;
+				  internal_sigma1_out = 4'd7;
+				end
+				5'b01001: begin 
+				  internal_sigma0_out = 4'd11;
+				  internal_sigma1_out = 4'd12;
+				end
+				
+				5'b01010: begin 
+				  internal_sigma0_out = 4'd6;
+				  internal_sigma1_out = 4'd10;
+				end
+				5'b01011: begin 
+				  internal_sigma0_out = 4'd7;
+				  internal_sigma1_out = 4'd5;
+				end
+				
+				5'b01100: begin 
+				  internal_sigma0_out = 4'd1;
+				  internal_sigma1_out = 4'd15;
+				end
+				5'b01101: begin 
+				  internal_sigma0_out = 4'd6;
+				  internal_sigma1_out = 4'd3;
+				end
+				
+				5'b01110: begin 
+				  internal_sigma0_out = 4'd7;
+				  internal_sigma1_out = 4'd14;
+				end
+				5'b01111: begin 
+				  internal_sigma0_out = 4'd15;
+				  internal_sigma1_out = 4'd4;
+				end
+				
+				5'b10000: begin 
+				  internal_sigma0_out = 4'd14;
+				  internal_sigma1_out = 4'd9;
+				end
+				5'b10001: begin 
+				  internal_sigma0_out = 4'd13;
+				  internal_sigma1_out = 4'd7;
+				end
+				
+				5'b10010: begin 
+				  internal_sigma0_out = 4'd8;
+				  internal_sigma1_out = 4'd4;
+				end
+				5'b10011: begin 
+				  internal_sigma0_out = 4'd9;
+				  internal_sigma1_out = 4'd14;
+				end
+				default: begin
+				  internal_sigma0_out = 4'd0;
+				  internal_sigma1_out = 4'd0;
+			    end
+			 endcase // ({round,qr_base}) /* Offset 1 */
+		  end // 1  
+		  
+		  /* Offset 2 */
+		  2: begin
+		    case ({round,qr_base})
+				5'b00000,
+				5'b10100: begin 
+				  internal_sigma0_out = 4'd4;
+				  internal_sigma1_out = 4'd5;
+				end
+				5'b00001,
+				5'b10101: begin 
+				  internal_sigma0_out = 4'd12;
+				  internal_sigma1_out = 4'd13;
+				end
+				
+				5'b00010,
+				5'b10110: begin 
+				  internal_sigma0_out = 4'd9;
+				  internal_sigma1_out = 4'd15;
+				end
+				5'b00011,
+				5'b10111: begin 
+				  internal_sigma0_out = 4'd11;
+				  internal_sigma1_out = 4'd7;
+				end
+				
+				5'b00100,
+				5'b11000: begin 
+				  internal_sigma0_out = 4'd5;
+				  internal_sigma1_out = 4'd2;
+				end
+				5'b00101,
+				5'b11001: begin 
+				  internal_sigma0_out = 4'd7;
+				  internal_sigma1_out = 4'd1;
+				end
+				
+				5'b00110,
+				5'b11010: begin 
+				  internal_sigma0_out = 4'd13;
+				  internal_sigma1_out = 4'd12;
+				end
+				5'b00111,
+				5'b11011: begin 
+				  internal_sigma0_out = 4'd4;
+				  internal_sigma1_out = 4'd0;
+				end
+				
+				5'b01000: begin 
+				  internal_sigma0_out = 4'd2;
+				  internal_sigma1_out = 4'd4;
+				end
+				5'b01001: begin 
+				  internal_sigma0_out = 4'd6;
+				  internal_sigma1_out = 4'd8;
+				end
+				
+				5'b01010: begin 
+				  internal_sigma0_out = 4'd0;
+				  internal_sigma1_out = 4'd11;
+				end
+				5'b01011: begin 
+				  internal_sigma0_out = 4'd15;
+				  internal_sigma1_out = 4'd14;
+				end
+				
+				5'b01100: begin 
+				  internal_sigma0_out = 4'd14;
+				  internal_sigma1_out = 4'd13;
+				end
+				5'b01101: begin 
+				  internal_sigma0_out = 4'd9;
+				  internal_sigma1_out = 4'd2;
+				end
+				
+				5'b01110: begin 
+				  internal_sigma0_out = 4'd12;
+				  internal_sigma1_out = 4'd1;
+				end
+				5'b01111: begin 
+				  internal_sigma0_out = 4'd8;
+				  internal_sigma1_out = 4'd6;
+				end
+				
+				5'b10000: begin 
+				  internal_sigma0_out = 4'd11;
+				  internal_sigma1_out = 4'd3;
+				end
+				5'b10001: begin 
+				  internal_sigma0_out = 4'd1;
+				  internal_sigma1_out = 4'd4;
+				end
+				
+				5'b10010: begin 
+				  internal_sigma0_out = 4'd7;
+				  internal_sigma1_out = 4'd6;
+				end
+				5'b10011: begin 
+				  internal_sigma0_out = 4'd3;
+				  internal_sigma1_out = 4'd12;
+				end
+				default: begin
+				  internal_sigma0_out = 4'd0;
+				  internal_sigma1_out = 4'd0;
+			    end
+			 endcase // ({round,qr_base}) /* Offset 2 */
+		  end // 2  
+		  
+		  /* Offset 3 */
+		  3: begin
+		    case ({round,qr_base})
+				5'b00000,
+				5'b10100: begin 
+				  internal_sigma0_out = 4'd6;
+				  internal_sigma1_out = 4'd7;
+				end
+				5'b00001,
+				5'b10101: begin 
+				  internal_sigma0_out = 4'd14;
+				  internal_sigma1_out = 4'd15;
+				end
+				
+				5'b00010,
+				5'b10110: begin 
+				  internal_sigma0_out = 4'd13;
+				  internal_sigma1_out = 4'd6;
+				end
+				5'b00011,
+				5'b10111: begin 
+				  internal_sigma0_out = 4'd5;
+				  internal_sigma1_out = 4'd3;
+				end
+		  
+				5'b00100,
+				5'b11000: begin 
+				  internal_sigma0_out = 4'd15;
+				  internal_sigma1_out = 4'd13;
+				end
+				5'b00101,
+				5'b11001: begin 
+				  internal_sigma0_out = 4'd9;
+				  internal_sigma1_out = 4'd4;
+				end
+		  
+				5'b00110,
+				5'b11010: begin 
+				  internal_sigma0_out = 4'd11;
+				  internal_sigma1_out = 4'd14;
+				end
+				5'b00111,
+				5'b11011: begin 
+				  internal_sigma0_out = 4'd15;
+				  internal_sigma1_out = 4'd8;
+				end
+		  
+				5'b01000: begin 
+				  internal_sigma0_out = 4'd10;
+				  internal_sigma1_out = 4'd15;
+				end
+				5'b01001: begin 
+				  internal_sigma0_out = 4'd3;
+				  internal_sigma1_out = 4'd13;
+				end
+		
+				5'b01010: begin 
+				  internal_sigma0_out = 4'd8;
+				  internal_sigma1_out = 4'd3;
+				end
+				5'b01011: begin 
+				  internal_sigma0_out = 4'd1;
+				  internal_sigma1_out = 4'd9;
+				end
+
+				5'b01100: begin 
+				  internal_sigma0_out = 4'd4;
+				  internal_sigma1_out = 4'd10;
+				end
+				5'b01101: begin 
+				  internal_sigma0_out = 4'd8;
+				  internal_sigma1_out = 4'd11;
+				end
+		  
+				5'b01110: begin 
+				  internal_sigma0_out = 4'd3;
+				  internal_sigma1_out = 4'd9;
+				end
+				5'b01111: begin 
+				  internal_sigma0_out = 4'd2;
+				  internal_sigma1_out = 4'd10;
+				end
+				
+				5'b10000: begin 
+				  internal_sigma0_out = 4'd0;
+				  internal_sigma1_out = 4'd8;
+				end
+				5'b10001: begin 
+				  internal_sigma0_out = 4'd10;
+				  internal_sigma1_out = 4'd5;
+				end
+				
+				5'b10010: begin 
+				  internal_sigma0_out = 4'd1;
+				  internal_sigma1_out = 4'd5;
+				end
+				5'b10011: begin 
+				  internal_sigma0_out = 4'd13;
+				  internal_sigma1_out = 4'd0;
+				end
+				default: begin
+				  internal_sigma0_out = 4'd0;
+				  internal_sigma1_out = 4'd0;
+			    end
+			 endcase // ({round,qr_base}) /* Offset 3 */
+		  end // 3
+		  default: begin
+		    internal_sigma0_out = 4'd0;
+			internal_sigma1_out = 4'd0;
+		  end
+		endcase
+    end // s_cs
+endmodule // Sigma_CS
+`endif
+
+// Arguments: stateName, thread, q_step_base (0 or 4), input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+`define NEXT_STATE(stateName, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr) \
+    qr_ctr_reg[0] <= q_step_base; \
+	 qr_ctr_reg[1] <= q_step_base+1; \
+	 qr_ctr_reg[2] <= q_step_base+2; \
+	 qr_ctr_reg[3] <= q_step_base+3; \
+	 input_qr_ptr <= thread; \
+	 input_qr_cols <= input_cols; \
+	 input_qr_diags <= input_diags; \
+	 update_mreg <= mreg_update; \
+	 update_mreg_ptr <= mreg_ptr; \
+	 output_qr_ptr <= qr_output_ptr; \
+	 output_qr_cols <= output_cols; \
+	 output_qr_diags <= output_diags; \
+	 ctr_reg_check <= ctr_check; \
+	 ctr_reg_ptr <= ctr_ptr; \
+    main_sr_state_reg <= stateName; 
+
+module blake256r14_core_nonblock #(
+  parameter UPPER_NONCE_START=0,
+  parameter NONCE_START=0,
+  parameter NONCE_STRIDE=`NUM_OF_THREADS
+)(
+
+		input wire            CLK,
+		input wire            HASH_EN,
+`ifdef USE_REG_WRITE_TO_HASHMACRO
+		input wire            MACRO_WR_SELECT,
+		input wire [7  : 0]   DATA_IN,
+`else
+		input wire  [255 : 0] MIDSTATE,
+		input wire  [127 : 0] HEADERDATA,
+		input wire  [31  : 0] ENONCE_IN,
+		input wire  [31  : 0] TARGET_MASK,
+`endif
+		input wire            MACRO_RD_SELECT,
+		input wire	[5   : 0] ADDR_IN,
+
+		output wire	[3   : 0] THREAD_COUNT,
+
+		output wire           DATA_AVAILABLE,
+		output wire [7  : 0]  DATA_OUT
+);
+
+`ifdef USE_VARIABLE_NONCE_OFFSET
+ `ifdef FULL_CHIP_SIM
+  `define NONCE_START_VAL  ({registers[58], registers[57], registers[56]})
+  `define NONCE_STRIDE_VAL ({registers[60], registers[59]})
+  `else
+  `define NONCE_START_VAL  ({registers[57], registers[56]})
+  `define NONCE_STRIDE_VAL ({registers[59], registers[58]})
+ `endif
+`else
+  `define NONCE_START_VAL  NONCE_START
+  `define NONCE_STRIDE_VAL NONCE_STRIDE
+`endif
+
+`ifdef USE_REG_WRITE_TO_HASHMACRO
+
+  wire  [255 : 0] MIDSTATE;
+  wire  [127 : 0] HEADERDATA;
+  wire  [31  : 0] ENONCE_IN;
+  wire  [31  : 0] TARGET_MASK;
+`endif
+
+  //----------------------------------------------------------------
+  // Internal constant and parameter definitions
+  //----------------------------------------------------------------
+
+  localparam V8_INIT  = 32'h243F6A88;
+  localparam V9_INIT  = 32'h85A308D3;
+  localparam V10_INIT = 32'h13198A2E;
+  localparam V11_INIT = 32'h03707344;
+  localparam V12_INIT = 32'ha4093d82;
+  localparam V13_INIT = 32'h299F3470;
+  localparam V14_INIT = 32'h082EFA98;
+  localparam V15_INIT = 32'hEC4E6C89;
+  localparam M13_INIT = 32'h80000001;
+  localparam M15_INIT = 32'h000005a0;
+
+  localparam BLAKE256R14_ROUNDS = 4'd14;
+  
+  localparam NUM_THREADS = 6;
+  
+  localparam NUM_ENGINES = 4;
+  
+  genvar i;
+  
+`ifdef USE_SYSTEM_VERILOG
+  // CS[16]
+  localparam [511:0] CS = {
+		32'h243F6A88, 32'h85A308D3, 32'h13198A2E, 32'h03707344,
+		32'hA4093822, 32'h299F31D0, 32'h082EFA98, 32'hEC4E6C89,
+		32'h452821E6, 32'h38D01377, 32'hBE5466CF, 32'h34E90C6C,
+		32'hC0AC29B7, 32'hC97C50DD, 32'h3F84D5B5, 32'hB5470917};
+		
+  localparam [895:0] SIGMA = {
+		4'd0, 4'd1, 4'd2, 4'd3, 4'd4, 4'd5, 4'd6, 4'd7, 4'd8, 4'd9, 4'd10,4'd11,4'd12,4'd13,4'd14,4'd15,
+		4'd14,4'd10,4'd4, 4'd8, 4'd9, 4'd15,4'd13,4'd6, 4'd1, 4'd12,4'd0, 4'd2, 4'd11,4'd7, 4'd5, 4'd3,
+		4'd11,4'd8, 4'd12,4'd0, 4'd5, 4'd2, 4'd15,4'd13,4'd10,4'd14,4'd3, 4'd6, 4'd7, 4'd1, 4'd9, 4'd4,
+		4'd7, 4'd9, 4'd3, 4'd1, 4'd13,4'd12,4'd11,4'd14,4'd2, 4'd6, 4'd5, 4'd10,4'd4, 4'd0, 4'd15,4'd8,
+		4'd9, 4'd0, 4'd5, 4'd7, 4'd2, 4'd4, 4'd10,4'd15,4'd14,4'd1, 4'd11,4'd12,4'd6, 4'd8, 4'd3, 4'd13,
+		4'd2, 4'd12,4'd6, 4'd10,4'd0, 4'd11,4'd8, 4'd3, 4'd4, 4'd13,4'd7, 4'd5, 4'd15,4'd14,4'd1, 4'd9,
+		4'd12,4'd5, 4'd1, 4'd15,4'd14,4'd13,4'd4, 4'd10,4'd0, 4'd7, 4'd6, 4'd3, 4'd9, 4'd2, 4'd8, 4'd11,
+		4'd13,4'd11,4'd7, 4'd14,4'd12,4'd1, 4'd3, 4'd9, 4'd5, 4'd0, 4'd15,4'd4, 4'd8, 4'd6, 4'd2, 4'd10,
+		4'd6, 4'd15,4'd14,4'd9, 4'd11,4'd3, 4'd0, 4'd8, 4'd12,4'd2, 4'd13,4'd7, 4'd1, 4'd4, 4'd10,4'd5,
+		4'd10,4'd2, 4'd8, 4'd4, 4'd7, 4'd6, 4'd1, 4'd5, 4'd15,4'd11,4'd9, 4'd14,4'd3, 4'd12,4'd13,4'd0,
+		4'd0, 4'd1, 4'd2, 4'd3, 4'd4, 4'd5, 4'd6, 4'd7, 4'd8, 4'd9, 4'd10,4'd11,4'd12,4'd13,4'd14,4'd15,
+		4'd14,4'd10,4'd4, 4'd8, 4'd9, 4'd15,4'd13,4'd6, 4'd1, 4'd12,4'd0, 4'd2, 4'd11,4'd7, 4'd5, 4'd3,
+		4'd11,4'd8, 4'd12,4'd0, 4'd5, 4'd2, 4'd15,4'd13,4'd10,4'd14,4'd3, 4'd6, 4'd7, 4'd1, 4'd9, 4'd4,
+		4'd7, 4'd9, 4'd3, 4'd1, 4'd13,4'd12,4'd11,4'd14,4'd2, 4'd6, 4'd5, 4'd10,4'd4, 4'd0, 4'd15,4'd8};
+`endif
+
+  assign THREAD_COUNT = `NUM_OF_THREADS;
+
+  //----------------------------------------------------------------
+  // QR instance regs/wires
+  //----------------------------------------------------------------
+  
+  reg [31 : 0]  qr_m0[3:0];
+  reg [31 : 0]  qr_m1[3:0];
+  reg [31 : 0]  qr_cs0[3:0];
+  reg [31 : 0]  qr_cs1[3:0];
+  reg [31 : 0]  qr_a[3:0];
+  reg [31 : 0]  qr_b[3:0];
+  reg [31 : 0]  qr_c[3:0];
+  reg [31 : 0]  qr_d[3:0];
+  wire [31 : 0] qr_a_prim[3:0];
+  wire [31 : 0] qr_b_prim[3:0];
+  wire [31 : 0] qr_c_prim[3:0];
+  wire [31 : 0] qr_d_prim[3:0];
+
+  //----------------------------------------------------------------
+  // Instantiation of the qr modules
+  //----------------------------------------------------------------
+  for (i = 0; i < NUM_ENGINES; i = i + 1) begin: qr_engine_multi_block
+    blake256_qr qr(
+		.clk(CLK),
+		.m0(qr_m0[i]),
+		.m1(qr_m1[i]),
+		.cs0(qr_cs0[i]),
+		.cs1(qr_cs1[i]),
+
+	   .a(qr_a[i]),
+      .b(qr_b[i]),
+      .c(qr_c[i]),
+      .d(qr_d[i]),
+
+      .a_prim(qr_a_prim[i]),
+      .b_prim(qr_b_prim[i]),
+      .c_prim(qr_c_prim[i]),
+      .d_prim(qr_d_prim[i])
+    );
+  end
+  
+  //----------------------------------------------------------------
+  // Superround regs/wires
+  //----------------------------------------------------------------
+  reg [31 : 0]  v_reg [NUM_THREADS-1:0][15 : 0];
+  reg [31 : 0]  m_reg [15 : 0];
+
+  reg [31 : 0]  m_reg3 [NUM_THREADS-1: 0];
+  reg [31 : 0]  m_reg4 [NUM_THREADS-1: 0];
+
+  reg           value_ready_for_test;
+  
+  reg [63 : 0]  target_check;
+  
+  reg [31 : 0]  m_save[1 : 0];
+  
+  reg [3   : 0] qr_ctr_reg[3:0];
+  reg [3   : 0] ctr_reg[NUM_THREADS-1: 0];
+
+  reg				 init_vreg_state;
+  reg  		 	 update_vregs;
+  reg [3 :0] 	 update_vreg_ptr;
+  
+  reg 			 update_mreg;
+  reg [3 :0]	 update_mreg_ptr;
+  
+  reg				 input_qr_cols;
+  reg				 input_qr_diags;
+  reg [3 :0]	 input_qr_ptr;
+  
+  reg				 output_qr_cols;
+  reg				 output_qr_diags;
+  reg [3 :0]	 output_qr_ptr;
+  
+  reg				 ctr_reg_check;
+  reg [3 :0]	 ctr_reg_ptr;
+
+  reg [13 : 0] main_sr_state_reg;
+  
+`ifndef USE_SYSTEM_VERILOG
+  reg [31 : 0]  CS[31:0];
+  wire [3  : 0] s0[3:0];
+  wire [3  : 0] s1[3:0];
+`endif
+
+  localparam MAIN_SR_STATE1  = 14'h0001;
+  localparam MAIN_SR_STATE2  = 14'h0002;
+  localparam MAIN_SR_STATE3  = 14'h0004;
+  localparam MAIN_SR_STATE4  = 14'h0008;
+  localparam MAIN_SR_STATE5  = 14'h0010;
+  localparam MAIN_SR_STATE6  = 14'h0020;
+  localparam MAIN_SR_STATE7  = 14'h0040;
+  localparam MAIN_SR_STATE8  = 14'h0080;
+  localparam MAIN_SR_STATE9  = 14'h0100;
+  localparam MAIN_SR_STATE10 = 14'h0200;
+  localparam MAIN_SR_STATE11 = 14'h0400;
+  localparam MAIN_SR_STATE12 = 14'h0800;
+  localparam MAIN_SR_STATE13 = 14'h1000;
+  localparam MAIN_SR_STATE14 = 14'h2000;
+
+`ifndef USE_SYSTEM_VERILOG
+  //----------------------------------------------------------------
+  // Instantiation of the sigma muxes
+  //----------------------------------------------------------------
+  for (i = 0; i < NUM_ENGINES; i = i + 1) begin: sigma_multi_block
+    Sigma_CS #(.QR_OFFSET(i)) sigmacs(
+		.round(ctr_reg[input_qr_ptr]),
+		.qr_base(qr_ctr_reg[i][2]),
+
+		.sigma0_out(s0[i]),
+		.sigma1_out(s1[i])
+    );
+  end
+`endif
+
+  //----------------------------------------------------------------
+  // M0/M1/CS0/CS1 Selector
+  //----------------------------------------------------------------
+  always @ (posedge CLK)
+  begin
+
+    if (input_qr_cols || input_qr_diags) begin
+`ifdef USE_SYSTEM_VERILOG
+      qr_m0[0]  <= m_reg[SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],(qr_ctr_reg[0]*2))]];
+	   qr_m1[0]  <= m_reg[SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],((qr_ctr_reg[0]*2)+1))]];
+	   qr_cs0[0] <= CS[`CS_INDEX(SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],(qr_ctr_reg[0]*2))])];
+	   qr_cs1[0] <= CS[`CS_INDEX(SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],((qr_ctr_reg[0]*2)+1))])];
+	 
+	   qr_m0[1]  <= m_reg[SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],(qr_ctr_reg[1]*2))]];
+	   qr_m1[1]  <= m_reg[SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],((qr_ctr_reg[1]*2)+1))]];
+	   qr_cs0[1] <= CS[`CS_INDEX(SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],(qr_ctr_reg[1]*2))])];
+	   qr_cs1[1] <= CS[`CS_INDEX(SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],((qr_ctr_reg[1]*2)+1))])];
+	 
+	   qr_m0[2]  <= m_reg[SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],(qr_ctr_reg[2]*2))]];
+	   qr_m1[2]  <= m_reg[SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],((qr_ctr_reg[2]*2)+1))]];
+	   qr_cs0[2] <= CS[`CS_INDEX(SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],(qr_ctr_reg[2]*2))])];
+	   qr_cs1[2] <= CS[`CS_INDEX(SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],((qr_ctr_reg[2]*2)+1))])];
+	 
+	   qr_m0[3]  <= m_reg[SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],(qr_ctr_reg[3]*2))]];
+	   qr_m1[3]  <= m_reg[SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],((qr_ctr_reg[3]*2)+1))]];
+	   qr_cs0[3] <= CS[`CS_INDEX(SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],(qr_ctr_reg[3]*2))])];
+	   qr_cs1[3] <= CS[`CS_INDEX(SIGMA[`SIGMA_INDEX(ctr_reg[input_qr_ptr],((qr_ctr_reg[3]*2)+1))])];
+`else
+      qr_m0[0] <= m_reg[s0[0]];
+	  qr_m1[0] <= m_reg[s1[0]];
+	  qr_cs0[0] <= CS[s0[0]];
+	  qr_cs1[0] <= CS[s1[0]];
+	  qr_m0[1] <= m_reg[s0[1]];
+	  qr_m1[1] <= m_reg[s1[1]];
+	  qr_cs0[1] <= CS[s0[1]];
+	  qr_cs1[1] <= CS[s1[1]];
+	  qr_m0[2] <= m_reg[s0[2]];
+	  qr_m1[2] <= m_reg[s1[2]];
+	  qr_cs0[2] <= CS[s0[2]];
+	  qr_cs1[2] <= CS[s1[2]];
+	  qr_m0[3] <= m_reg[s0[3]];
+	  qr_m1[3] <= m_reg[s1[3]];
+	  qr_cs0[3] <= CS[s0[3]];
+	  qr_cs1[3] <= CS[s1[3]];
+`endif
+    end
+  end
+  
+  //----------------------------------------------------------------
+  // v_reg initialization
+  //----------------------------------------------------------------
+  always @ (posedge CLK)
+  begin
+    
+	 if (update_vregs) begin
+		v_reg[update_vreg_ptr][00] <= MIDSTATE[255 : 224];
+		v_reg[update_vreg_ptr][01] <= MIDSTATE[223 : 192];
+		v_reg[update_vreg_ptr][02] <= MIDSTATE[191 : 160];
+		v_reg[update_vreg_ptr][03] <= MIDSTATE[159 : 128];
+		v_reg[update_vreg_ptr][04] <= MIDSTATE[127 :  96];
+		v_reg[update_vreg_ptr][05] <= MIDSTATE[95  :  64];
+		v_reg[update_vreg_ptr][06] <= MIDSTATE[63  :  32];
+		v_reg[update_vreg_ptr][07] <= MIDSTATE[31  :   0];
+		v_reg[update_vreg_ptr][08] <= V8_INIT;
+		v_reg[update_vreg_ptr][09] <= V9_INIT;
+		v_reg[update_vreg_ptr][10] <= V10_INIT;
+		v_reg[update_vreg_ptr][11] <= V11_INIT;
+		v_reg[update_vreg_ptr][12] <= V12_INIT;
+		v_reg[update_vreg_ptr][13] <= V13_INIT;
+		v_reg[update_vreg_ptr][14] <= V14_INIT;
+		v_reg[update_vreg_ptr][15] <= V15_INIT;
+	 end
+  end
+	 
+
+  //----------------------------------------------------------------
+  // m_reg input
+  //----------------------------------------------------------------
+  always @ (posedge CLK)
+  begin
+    if (HASH_EN == 0)
+    begin
+	   // M0 = Block height
+	   m_reg[00] <= HEADERDATA[127:96 ];
+	   // M1 = Size (Number of bytes the serialized block occupies)
+	   m_reg[01] <= HEADERDATA[95 :64 ];
+	   // M2 = Timestamp of when the block was created
+	   m_reg[02] <= HEADERDATA[63 :32 ];
+		// M3/M4 = Nonce, starts at parameter values
+	   m_reg[03] <= 0;
+	   m_reg[04] <= 0;
+	   m_reg[05] <= ENONCE_IN[31 : 0 ];
+	   m_reg[06] <= 0;
+	   m_reg[07] <= 0;
+	   m_reg[08] <= 0;
+	   m_reg[09] <= 0;
+	   m_reg[10] <= 0;
+	   m_reg[11] <= 0;
+	   // M12 = Stake Version
+	   m_reg[12] <= HEADERDATA[31:0];
+	   m_reg[13] <= M13_INIT;
+	   m_reg[14] <= 0;
+	   m_reg[15] <= M15_INIT;
+	   
+`ifndef USE_SYSTEM_VERILOG
+	  // CS Constants
+	  CS[00] <= 32'h243F6A88;
+	  CS[01] <= 32'h85A308D3;
+	  CS[02] <= 32'h13198A2E;
+	  CS[03] <= 32'h03707344;
+	  CS[04] <= 32'hA4093822;
+	  CS[05] <= 32'h299F31D0;
+	  CS[06] <= 32'h082EFA98;
+	  CS[07] <= 32'hEC4E6C89;
+	  CS[08] <= 32'h452821E6;
+	  CS[09] <= 32'h38D01377;
+	  CS[10] <= 32'hBE5466CF;
+	  CS[11] <= 32'h34E90C6C;
+	  CS[12] <= 32'hC0AC29B7;
+	  CS[13] <= 32'hC97C50DD;
+	  CS[14] <= 32'h3F84D5B5;
+	  CS[15] <= 32'hB5470917;
+`endif
+	   
+	 end else begin
+      if (update_mreg) begin
+        m_reg[03] <= m_reg3[update_mreg_ptr];
+        m_reg[04] <= m_reg4[update_mreg_ptr];
+      end
+    end
+  end
+  
+  //----------------------------------------------------------------
+  // qr input
+  //----------------------------------------------------------------
+  always @ (posedge CLK)
+  begin
+	 if (input_qr_cols) begin
+		qr_a[0] <= v_reg[input_qr_ptr][00];
+      qr_b[0] <= v_reg[input_qr_ptr][04];
+      qr_c[0] <= v_reg[input_qr_ptr][08];
+      qr_d[0] <= v_reg[input_qr_ptr][12];
+			 
+		qr_a[1] <= v_reg[input_qr_ptr][01];
+      qr_b[1] <= v_reg[input_qr_ptr][05];
+      qr_c[1] <= v_reg[input_qr_ptr][09];
+      qr_d[1] <= v_reg[input_qr_ptr][13];
+			 
+		qr_a[2] <= v_reg[input_qr_ptr][02];
+      qr_b[2] <= v_reg[input_qr_ptr][06];
+      qr_c[2] <= v_reg[input_qr_ptr][10];
+      qr_d[2] <= v_reg[input_qr_ptr][14];
+			 
+		qr_a[3] <= v_reg[input_qr_ptr][03];
+      qr_b[3] <= v_reg[input_qr_ptr][07];
+      qr_c[3] <= v_reg[input_qr_ptr][11];
+      qr_d[3] <= v_reg[input_qr_ptr][15];
+	 end
+	 
+	 if (input_qr_diags) begin
+		qr_a[0] <= v_reg[input_qr_ptr][00];
+      qr_b[0] <= v_reg[input_qr_ptr][05];
+      qr_c[0] <= v_reg[input_qr_ptr][10];
+      qr_d[0] <= v_reg[input_qr_ptr][15];
+			 
+      qr_a[1] <= v_reg[input_qr_ptr][01];
+      qr_b[1] <= v_reg[input_qr_ptr][06];
+      qr_c[1] <= v_reg[input_qr_ptr][11];
+      qr_d[1] <= v_reg[input_qr_ptr][12];
+			 
+		qr_a[2] <= v_reg[input_qr_ptr][02];
+      qr_b[2] <= v_reg[input_qr_ptr][07];
+      qr_c[2] <= v_reg[input_qr_ptr][08];
+      qr_d[2] <= v_reg[input_qr_ptr][13];
+			 
+		qr_a[3] <= v_reg[input_qr_ptr][03];
+      qr_b[3] <= v_reg[input_qr_ptr][04];
+      qr_c[3] <= v_reg[input_qr_ptr][09];
+      qr_d[3] <= v_reg[input_qr_ptr][14];
+	 end
+  end
+
+  //----------------------------------------------------------------
+  // qr output
+  //----------------------------------------------------------------
+  always @ (posedge CLK)
+  begin
+	 
+	 if (output_qr_cols) begin
+	   v_reg[output_qr_ptr][00] <= qr_a_prim[0];
+      v_reg[output_qr_ptr][04] <= qr_b_prim[0];
+      v_reg[output_qr_ptr][08] <= qr_c_prim[0];
+      v_reg[output_qr_ptr][12] <= qr_d_prim[0];
+			 
+		v_reg[output_qr_ptr][01] <= qr_a_prim[1];
+      v_reg[output_qr_ptr][05] <= qr_b_prim[1];
+      v_reg[output_qr_ptr][09] <= qr_c_prim[1];
+      v_reg[output_qr_ptr][13] <= qr_d_prim[1];
+			 
+		v_reg[output_qr_ptr][02] <= qr_a_prim[2];
+      v_reg[output_qr_ptr][06] <= qr_b_prim[2];
+      v_reg[output_qr_ptr][10] <= qr_c_prim[2];
+      v_reg[output_qr_ptr][14] <= qr_d_prim[2];
+			 
+		v_reg[output_qr_ptr][03] <= qr_a_prim[3];
+      v_reg[output_qr_ptr][07] <= qr_b_prim[3];
+      v_reg[output_qr_ptr][11] <= qr_c_prim[3];
+      v_reg[output_qr_ptr][15] <= qr_d_prim[3];
+	 end
+	 
+	 if (output_qr_diags) begin
+	   v_reg[output_qr_ptr][00] <= qr_a_prim[0];
+      v_reg[output_qr_ptr][05] <= qr_b_prim[0];
+      v_reg[output_qr_ptr][10] <= qr_c_prim[0];
+      v_reg[output_qr_ptr][15] <= qr_d_prim[0];
+			 
+		v_reg[output_qr_ptr][01] <= qr_a_prim[1];
+      v_reg[output_qr_ptr][06] <= qr_b_prim[1];
+      v_reg[output_qr_ptr][11] <= qr_c_prim[1];
+      v_reg[output_qr_ptr][12] <= qr_d_prim[1];
+			 
+		v_reg[output_qr_ptr][02] <= qr_a_prim[2];
+      v_reg[output_qr_ptr][07] <= qr_b_prim[2];
+      v_reg[output_qr_ptr][08] <= qr_c_prim[2];
+      v_reg[output_qr_ptr][13] <= qr_d_prim[2];
+			 
+		v_reg[output_qr_ptr][03] <= qr_a_prim[3];
+      v_reg[output_qr_ptr][04] <= qr_b_prim[3];
+      v_reg[output_qr_ptr][09] <= qr_c_prim[3];
+      v_reg[output_qr_ptr][14] <= qr_d_prim[3];
+	 end
+  end
+  
+  //----------------------------------------------------------------
+  // ctr_reg setting
+  //----------------------------------------------------------------
+  always @ (posedge CLK)
+  begin : ctr_reg_setting
+    integer i;
+    if (HASH_EN == 0) begin
+	   for (i = 0; i < NUM_THREADS; i = i + 1) begin
+		  ctr_reg[i] <= 0;
+		end
+	 end else begin
+	 
+	   if (input_qr_diags && qr_ctr_reg[0] == 4) begin
+		
+		  ctr_reg[input_qr_ptr] <= ctr_reg[input_qr_ptr] + 1'b1;
+		
+		end
+		
+		if (ctr_reg_check && ctr_reg[ctr_reg_ptr] == BLAKE256R14_ROUNDS) begin
+		
+		  ctr_reg[ctr_reg_ptr] <= 0;
+		
+		end
+	 
+	 end
+  end
+
+  //----------------------------------------------------------------
+  // ctr_reg check
+  //----------------------------------------------------------------
+  always @ (posedge CLK)
+  begin : ctr_logic
+	 integer i;
+
+	 if (HASH_EN == 0) begin
+	   for (i = 0; i < NUM_THREADS; i = i + 1) begin
+		   m_reg3[i] <= `NONCE_START_VAL + i;
+		   m_reg4[i] <= UPPER_NONCE_START;
+     end
+
+		init_vreg_state 	<= 1;
+      update_vregs 		<= 1;
+      update_vreg_ptr 	<= 0;
+		
+		value_ready_for_test <= 0;
+	 
+	 end else begin
+	   
+		if (ctr_reg_check && ctr_reg[ctr_reg_ptr] == BLAKE256R14_ROUNDS) begin
+
+	     for (i = 6; i < 8; i = i + 1) begin
+	       target_check[224-(i*32)+: 32] <=  MIDSTATE[224-(i*32)+: 32] ^ v_reg[ctr_reg_ptr][i] ^ v_reg[ctr_reg_ptr][i + 8];
+		  end
+
+		  m_save[1] <= m_reg3[ctr_reg_ptr];
+		  m_save[0] <= m_reg4[ctr_reg_ptr];
+
+		  m_reg3[ctr_reg_ptr] <= m_reg3[ctr_reg_ptr] + `NONCE_STRIDE_VAL;
+		  if ((m_reg3[ctr_reg_ptr]+`NONCE_STRIDE_VAL) < m_reg3[ctr_reg_ptr]) begin
+
+		    m_reg4[ctr_reg_ptr] <= m_reg4[ctr_reg_ptr] + 1;
+		  end
+				
+		  value_ready_for_test <= 1;
+	   end
+	   else begin
+	     value_ready_for_test <= 0;
+      end
+		
+		if (init_vreg_state || (ctr_reg_check && ctr_reg[ctr_reg_ptr] == BLAKE256R14_ROUNDS)) begin
+		  update_vregs    	<= 1;
+		  update_vreg_ptr 	<= ctr_reg_ptr;
+		end
+		
+		if (ctr_reg_check == 0) begin
+		
+		  init_vreg_state 	<= 0;
+		  update_vregs    	<= 0;
+		
+		end
+    
+	 end
+  end
+
+  //----------------------------------------------------------------
+  // Superround FSM
+  //----------------------------------------------------------------
+  always @ (posedge CLK)
+  begin : v_logic
+
+    if (HASH_EN == 0)
+    begin
+
+	   qr_ctr_reg[0] <= 0;
+		qr_ctr_reg[1] <= 0;
+		qr_ctr_reg[2] <= 0;
+		qr_ctr_reg[3] <= 0;
+
+		update_mreg			<= 1; // for initial m_reg loading
+		update_mreg_ptr	<= 0; // for initial m_reg loading
+		
+		input_qr_ptr 		<= 0;
+		input_qr_cols 		<= 0;
+		input_qr_diags 	<= 0;
+		
+		output_qr_ptr 		<= 2; // preload for loop rolled
+		output_qr_cols 	<= 0;	// preload for loop rolled
+		output_qr_diags 	<= 1;	// preload for loop rolled
+		
+		ctr_reg_check     <= 1; // for initial v_reg loading
+		ctr_reg_ptr       <= 1; // for initial v_reg loading
+
+	   main_sr_state_reg <= MAIN_SR_STATE1;
+
+    end
+    else begin
+
+      case (main_sr_state_reg)
+
+        MAIN_SR_STATE1: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+          `NEXT_STATE(MAIN_SR_STATE2, 0, 0, 	 1, 			 0, 		     1, 			   1, 		 3, 			    0, 			  1, 			    1, 		   2)
+        end
+
+        MAIN_SR_STATE2: begin
+
+          // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE3, 1, 0, 	 1, 			 0, 		     1, 			   2, 		 4, 			    0, 			  1, 			    1, 		   3)
+        end
+
+        MAIN_SR_STATE3: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE4, 2, 0, 	 1, 			 0, 		     1, 			   3, 		 5, 			    0, 			  1, 			    1, 		   4)
+        end
+
+        MAIN_SR_STATE4: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE5, 3, 0, 	 1, 			 0, 		     1, 			   4, 		 0, 			    0, 			  0, 			    1, 		   5)
+        end
+
+        MAIN_SR_STATE5: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE6, 4, 0, 	 1, 			 0, 		     1, 			   5, 		 0, 			    0, 			  0, 			    0, 		   0)
+        end
+
+        MAIN_SR_STATE6: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE7, 5, 0, 	 1, 			 0, 		     1, 			   0, 		 0, 			    1, 			  0, 			    0, 		   0)
+        end
+
+        MAIN_SR_STATE7: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE8, 0, 4, 	 0, 			 1, 		     1, 			   1, 		 1, 			    1, 			  0, 			    0, 		   0)
+        end
+
+        MAIN_SR_STATE8: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE9, 1, 4, 	 0, 			 1, 		     1, 			   2, 		 2, 			    1, 			  0, 			    0, 		   0)
+        end
+
+        MAIN_SR_STATE9: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE10, 2, 4,  0, 			 1, 		     1, 			   3, 		 3, 			    1, 			  0, 			    0, 		   0)
+        end
+
+        MAIN_SR_STATE10: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE11, 3, 4,  0, 			 1, 		     1, 			   4, 		 4, 			    1, 			  0, 			    0, 		   0)
+        end
+
+        MAIN_SR_STATE11: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE12, 4, 4,  0, 			 1, 		     1, 			   5, 		 5, 			    1, 			  0, 			    0, 		   0)
+        end
+
+        MAIN_SR_STATE12: begin
+		  
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE13, 5, 4,  0, 			 1, 		     0, 			   0, 		 0, 			    0, 			  1, 			    0, 		   0)
+        end
+
+        MAIN_SR_STATE13: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE14, 0, 0,  0, 			 0, 		     0, 			   0, 		 1, 			    0, 			  1, 			    1, 		   0)
+        end
+
+        MAIN_SR_STATE14: begin
+
+			 // next state, thread, q_step_base, input_cols, input_diags, mreg_update, mreg_ptr, qr_output_ptr, output_cols, output_diags, ctr_check, ctr_ptr
+			 `NEXT_STATE(MAIN_SR_STATE1, 0, 0,   0, 			 0, 		     1, 			   0, 		 2, 			    0, 			  1, 			    1, 		   1)
+        end
+
+      endcase
+    end
+  end
+
+  //----------------------------------------------------------------
+  // Solution qualifier
+  //----------------------------------------------------------------
+  reg           int_internal_reg;
+  reg [63  : 0] nonce_enonce_out_reg;
+
+  wire targetCheckSub;
+  wire targetMaskSub;
+    
+  assign targetCheckSub  =  (target_check[31  :  0] == 32'h0);
+  assign targetMaskSub  =  ((target_check[63 : 32] & TARGET_MASK[31 :  0]) == 32'h0);
+  
+  always @ (posedge CLK)
+  begin
+    if (value_ready_for_test && targetCheckSub && targetMaskSub)
+    begin
+      int_internal_reg <= 1;
+      nonce_enonce_out_reg <= {m_save[1],m_save[0]};
+    end else
+    begin
+      int_internal_reg <= 0;
+    end
+  end
+
+  //----------------------------------------------------------------
+  // inbound reg block
+  //----------------------------------------------------------------
+`ifdef USE_REG_WRITE_TO_HASHMACRO
+  // MIDSTATE    = 31:0   : 0x00 - 0x1F
+  // TARGET_MASK = 35:32  : 0x20 - 0x23
+  // HEADERDATA  = 51:36  : 0x24 - 0x33
+  // EXTRANONCE  = 55:52  : 0x34 - 0x37
+  // NONCE_START = 57:56  : 0x39 - 0x38
+  // STRIDE      = 59:58  : 0x3B - 0x3A
+`ifdef USE_VARIABLE_NONCE_OFFSET
+ `ifdef FULL_CHIP_SIM
+  reg  [7:0] registers [60:0];
+ `else
+  reg  [7:0] registers [59:0];
+ `endif
+`else
+  reg  [7:0] registers [55:0];
+`endif
+  wire [7:0] data_in_bus;
+
+  assign data_in_bus = DATA_IN;
+
+  always @ (posedge CLK)
+  begin
+    if (MACRO_WR_SELECT)
+	 begin
+      registers[ADDR_IN] <= data_in_bus;
+	 end
+  end
+
+  assign MIDSTATE = {registers[0], registers[1], registers[2], registers[3],
+						registers[4], registers[5], registers[6], registers[7],
+						registers[8], registers[9], registers[10],registers[11],
+						registers[12],registers[13],registers[14],registers[15],
+						registers[16],registers[17],registers[18],registers[19],
+						registers[20],registers[21],registers[22],registers[23],
+						registers[24],registers[25],registers[26],registers[27],
+						registers[28],registers[29],registers[30],registers[31]};
+  assign TARGET_MASK = {registers[32],registers[33],registers[34],registers[35]};
+  assign HEADERDATA = {registers[36],registers[37],registers[38],registers[39],
+						registers[40],registers[41],registers[42],registers[43],
+						registers[44],registers[45],registers[46],registers[47],
+						registers[48],registers[49],registers[50],registers[51]};
+  assign ENONCE_IN = {registers[52],registers[53],registers[54],registers[55]};
+
+`endif
+
+  //----------------------------------------------------------------
+  // outbound reg block
+  //----------------------------------------------------------------
+
+  reg solution_ready;
+  assign DATA_AVAILABLE = solution_ready;
+
+  always @ (posedge CLK)
+  begin
+    if ((HASH_EN == 0) || (MACRO_RD_SELECT))
+	 begin
+      solution_ready <= 0;
+	 end else
+    if (int_internal_reg == 1'b1)
+    begin
+	   solution_ready <= 1;
+    end
+  end
+
+  reg				output_enable;
+
+  reg  [7:0]	data_output_bus;
+  assign DATA_OUT = (MACRO_RD_SELECT) ? data_output_bus : 8'bZ;
+
+  always @ (*)
+  begin
+    case (ADDR_IN)
+      0:  data_output_bus = nonce_enonce_out_reg[7 : 0];
+      1:  data_output_bus = nonce_enonce_out_reg[15: 8];
+      2:  data_output_bus = nonce_enonce_out_reg[23:16];
+      3:  data_output_bus = nonce_enonce_out_reg[31:24];
+      4:  data_output_bus = nonce_enonce_out_reg[39:32];
+      5:  data_output_bus = nonce_enonce_out_reg[47:40];
+      6:  data_output_bus = nonce_enonce_out_reg[55:48];
+      7:  data_output_bus = nonce_enonce_out_reg[63:56];
+    default: data_output_bus = 8'h4A;
+    endcase
+  end
+
+endmodule // blake256r14_core
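Note on the nonce update in the ctr_logic block above: the lower nonce word advances by the stride, and the upper word is incremented when that addition wraps past 32 bits. The standalone sketch below shows the same idea with the carry bit made explicit through a 33-bit sum. It is an illustration only and not part of this patch; the module name nonce_stride_sketch, its port names, and the STRIDE default are hypothetical.

```verilog
// Hypothetical standalone sketch; not part of the patch.
// Advances a 64-bit nonce held as two 32-bit words by a fixed stride.
module nonce_stride_sketch #(
  parameter [31:0] STRIDE = 32'd6   // assumed value, for illustration only
)(
  input  wire [31:0] nonce_lo,      // plays the role of m_reg3[thread]
  input  wire [31:0] nonce_hi,      // plays the role of m_reg4[thread]
  output wire [31:0] next_lo,
  output wire [31:0] next_hi
);

  // Extend both operands to 33 bits so the carry out of bit 31 is kept.
  wire [32:0] sum = {1'b0, nonce_lo} + {1'b0, STRIDE};

  assign next_lo = sum[31:0];
  // Carry into the upper word only when the lower word wrapped.
  assign next_hi = nonce_hi + {31'b0, sum[32]};

endmodule
```

Keeping the carry as an explicit bit avoids depending on the expression width of the comparison used to detect the wrap.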
diff --git a/verilog/rtl/decred_top/rtl/src/register_bank.v b/verilog/rtl/decred_top/rtl/src/register_bank.v
new file mode 100755
index 0000000..cd8b105
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/register_bank.v
@@ -0,0 +1,218 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+`timescale 1ns / 1ps
+`include "decred_defines.v"
+
+module regBank #(
+  parameter DATA_WIDTH=8,
+  parameter ADDR_WIDTH=8,
+  parameter NUM_OF_MACROS=2
+)(
+  input  wire                  SPI_CLK,
+  input  wire                  RST,
+  input  wire                  M1_CLK,
+  input  wire [ADDR_WIDTH-1:0] address,
+  input  wire [DATA_WIDTH-1:0] data_in,
+  input  wire                  read_strobe,
+  input  wire                  write_strobe,
+  output reg [DATA_WIDTH-1:0]  data_out,
+
+  output wire                  hash_clock_reset,
+  output wire                  LED_out,
+  output wire [6:0]            spi_addr,
+  output wire                  ID_out,
+  output wire                  interrupt_out
+  );
+
+  localparam REGISTERS = 6;
+
+  // //////////////////////////////////////////////////////
+  // reg array
+
+  reg  [DATA_WIDTH-1:0] registers [REGISTERS-1:0];
+
+  reg  [7: 0] macro_data_read_rs[1:0];
+  wire [3 :0] threadCount [NUM_OF_MACROS-1:0];
+
+  reg [31:0] perf_counter;
+  always @(posedge M1_CLK)
+    if (registers[3][2] == 1'b1) 
+	    perf_counter <= perf_counter + 1'b1;
+
+  always @(posedge SPI_CLK) begin : REG_WRITE_BLOCK
+    integer i;
+    if(RST) begin 
+      for (i = 0; i < REGISTERS; i = i + 1) begin
+        registers[i] <= 0;
+      end
+    end
+    else begin
+      if (write_strobe) begin
+        registers[address] <= data_in;
+      end
+    end
+  end
+
+  always @(posedge SPI_CLK) begin
+    if (read_strobe) begin
+		if (address[7:0] == 8'h02) begin
+			// interrupt active register
+			data_out <= macro_rs[1];
+		end else
+		  if (address[7:0] == 8'h05) begin
+			// ID register
+			data_out <= 8'h11;
+		end else
+		  if (address[7:0] == 8'h06) begin
+			// MACRO_INFO register
+			data_out <= ((NUM_OF_MACROS << 4) | (threadCount[0]));
+		end else
+		  if (address[7:0] == 8'h07) begin
+			data_out <= perf_counter[7:0];
+		end else
+		  if (address[7:0] == 8'h08) begin
+			data_out <= perf_counter[15:8];
+		end else
+		  if (address[7:0] == 8'h09) begin
+			data_out <= perf_counter[23:16];
+		end else
+		  if (address[7:0] == 8'h0A) begin
+			data_out <= perf_counter[31:24];
+		end else
+      if (address[7] == 0) begin
+        data_out <= registers[address[6:0]];
+      end
+	   else begin
+			data_out <= macro_data_read_rs[1];
+	    end
+    end
+  end
+
+  //  WRITE REGS
+  // MACRO_ADDR  =  0     : 0x00
+  // MACRO_DATA  =  1     : 0x01 (write only)
+  // MACRO_SELECT=  2     : 0x02 (int status on readback)
+  // CONTROL     =  3     : 0x03
+  //   CONTROL.0 = HASHCTRL
+  //   CONTROL.1 = <available>
+  //   CONTROL.2 = PERF_COUNTER run
+  //   CONTROL.3 = LED output
+  //   CONTROL.4 = hash_clock_reset
+  //   CONTROL.5 = ID_out
+  // SPI_ADDR    =  4     : 0x04
+  //   SPI_ADDR.x= Address on SPI bus (6:0)
+  // ID REG      =  5     : 0x05 (read-only; reads return 8'h11)
+  // MACRO_WR_EN =  5     : 0x05 macro write enable (write-only; shares the address with ID REG)
+  // MACRO_INFO  =  6     : 0x06 macro count (read-only)
+  // PERF_CTR    = 10-7   : 0x0A - 0x07 (read-only)
+
+  assign spi_addr = registers[4][6:0];
+
+  assign LED_out = registers[3][3];
+  assign ID_out = registers[3][5];
+
+  assign hash_clock_reset = registers[3][4];
+
+  // //////////////////////////////////////////////////////
+  // resync - signals to hash_macro 
+
+  reg [1:0] hash_en_rs;
+  wire      HASH_start;
+
+  always @ (posedge M1_CLK)
+  begin
+    hash_en_rs <= {hash_en_rs[0], registers[3][0]};
+  end
+  assign HASH_start = hash_en_rs[1];
+
+  reg		[NUM_OF_MACROS - 1: 0]	wr_select_rs[1:0];
+  always @ (posedge M1_CLK)
+  begin
+    wr_select_rs[1] <= wr_select_rs[0];
+    wr_select_rs[0] <= registers[5][NUM_OF_MACROS - 1: 0];
+  end
+
+  reg		[7: 0]	macro_data_write_rs[1:0];
+  always @ (posedge M1_CLK)
+  begin
+    macro_data_write_rs[1] <= macro_data_write_rs[0];
+    macro_data_write_rs[0] <= registers[1];
+  end
+
+  reg		[NUM_OF_MACROS - 1: 0]	rd_select_rs[1:0];
+  always @ (posedge M1_CLK)
+  begin
+    rd_select_rs[1] <= rd_select_rs[0];
+    rd_select_rs[0] <= registers[2][NUM_OF_MACROS - 1: 0];
+  end
+
+  reg		[5: 0]	macro_addr_rs[1:0];
+  always @ (posedge M1_CLK)
+  begin
+    macro_addr_rs[1] <= macro_addr_rs[0];
+    macro_addr_rs[0] <= registers[0][5:0];
+  end
+
+  // //////////////////////////////////////////////////////
+  // resync - signals from hash_macro 
+
+  wire	[NUM_OF_MACROS - 1: 0]	macro_interrupts;
+  reg		[NUM_OF_MACROS - 1: 0]	macro_rs[1:0];
+
+  always @(posedge SPI_CLK) begin
+    macro_rs[1] <= macro_rs[0];
+    macro_rs[0] <= macro_interrupts;
+  end
+
+  assign interrupt_out = |macro_rs[1];
+
+  wire [7: 0] macro_data_readback;
+
+  always @(posedge SPI_CLK) begin
+    macro_data_read_rs[1] <= macro_data_read_rs[0];
+    macro_data_read_rs[0] <= macro_data_readback;
+  end
+
+  // //////////////////////////////////////////////////////
+  // hash macro interface
+
+  genvar i;
+  for (i = 0; i < NUM_OF_MACROS; i = i + 1) begin: hash_macro_multi_block
+`ifdef USE_NONBLOCKING_HASH_MACRO
+    blake256r14_core_nonblock hash_macro (
+`else
+    blake256r14_core_block hash_macro (
+`endif
+					  
+						.CLK(M1_CLK), 
+						.HASH_EN(HASH_start), 
+
+						.MACRO_WR_SELECT(wr_select_rs[1][i]),
+						.DATA_IN(macro_data_write_rs[1]),
+
+						.MACRO_RD_SELECT(rd_select_rs[1][i]),
+						.ADDR_IN(macro_addr_rs[1]),
+
+						.THREAD_COUNT(threadCount[i]), // only threadCount[0] is read back (MACRO_INFO)
+
+						.DATA_AVAILABLE(macro_interrupts[i]),
+						.DATA_OUT(macro_data_readback)
+					  );
+  end
+
+endmodule // regBank
+
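Note on the resync blocks in regBank: every signal that crosses between the SPI_CLK and M1_CLK domains passes through two back-to-back registers clocked in the destination domain. A minimal generic sketch of that pattern is given below. It is not part of this patch; the module name sync2_sketch and its ports are hypothetical.

```verilog
// Hypothetical standalone sketch; not part of the patch.
// Two-stage register chain clocked in the destination domain.
module sync2_sketch #(
  parameter WIDTH = 1
)(
  input  wire             dst_clk,  // destination-domain clock
  input  wire [WIDTH-1:0] d,        // signal launched from the source domain
  output wire [WIDTH-1:0] q         // version that is safe to use in dst_clk logic
);

  reg [WIDTH-1:0] stage0;
  reg [WIDTH-1:0] stage1;

  always @(posedge dst_clk) begin
    stage0 <= d;        // first stage may go metastable
    stage1 <= stage0;   // second stage gives it a cycle to settle
  end

  assign q = stage1;

endmodule
```

For WIDTH greater than 1 the bits are not guaranteed to arrive in the same destination cycle, so a multi-bit value should be held stable in the source domain before any select or strobe that qualifies it changes.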
diff --git a/verilog/rtl/decred_top/rtl/src/spi_des.v b/verilog/rtl/decred_top/rtl/src/spi_des.v
new file mode 100755
index 0000000..68b11c9
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/spi_des.v
@@ -0,0 +1,159 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+`timescale 1ns / 1ps
+
+module spi (
+  input  wire  SPI_CLK,
+  input  wire  RST,
+  input  wire  SCLK,
+  input  wire  SCSN,
+  input  wire  MOSI,
+
+  output reg       start_of_transfer,
+  output reg       end_of_transfer,
+  output reg [7:0] mosi_data_out,
+  output reg       mosi_data_ready,
+  output reg       MISO,
+  output reg       miso_data_request,
+  input  [7:0]     miso_data_in
+  );
+
+  // //////////////////////////////////////////////////////
+  // synchronizers
+  reg [1:0] scsn_resync;
+  reg [1:0] sclk_resync;
+  reg [1:0] mosi_resync;
+
+  reg [1:0] scsn_edge;
+  reg [1:0] sclk_edge;
+
+  wire scsn_rs;
+  wire mosi_rs;
+  reg  rising_sclk;
+  reg  falling_sclk;
+
+  always @(posedge SPI_CLK)
+    if (RST) begin
+      sclk_resync <= 0;
+      sclk_edge   <= 'h0;
+      scsn_resync <= 'h3;
+      scsn_edge   <= 'h3;
+      mosi_resync <= 0;
+    end
+    else begin
+      scsn_resync <= {scsn_resync[0], SCSN};
+      scsn_edge   <= {scsn_edge[0], scsn_resync[1]};
+      sclk_resync <= {sclk_resync[0], SCLK};
+      sclk_edge   <= {sclk_edge[0], sclk_resync[1]};
+      mosi_resync <= {mosi_resync[0], MOSI};
+    end
+
+  assign scsn_rs = scsn_resync[1];
+  assign mosi_rs = mosi_resync[1];
+  
+  always @(posedge SPI_CLK)
+    if (RST) begin
+      rising_sclk  <= 0;
+      falling_sclk <= 0;
+      start_of_transfer <= 0;
+      end_of_transfer <= 0;
+    end
+    else begin
+      rising_sclk  <= !sclk_edge[1] & sclk_edge[0] & !scsn_rs;
+      falling_sclk <= sclk_edge[1] & !sclk_edge[0] & !scsn_rs;
+      start_of_transfer <= scsn_edge[1] & !scsn_edge[0];
+      end_of_transfer <= !scsn_edge[1] & scsn_edge[0];
+    end
+
+  // //////////////////////////////////////////////////////
+  // strobes
+
+  reg [2:0] bitcount;
+  reg       byteCountStrobe;
+
+  always @(posedge SPI_CLK)
+    if (RST) begin
+      bitcount <= 0;
+      byteCountStrobe <= 0;
+    end
+	else if (start_of_transfer) begin
+      bitcount <= 0;
+      byteCountStrobe <= 0;
+    end
+    else if (falling_sclk) begin
+      bitcount <= bitcount + 1'b1;
+      byteCountStrobe <= (bitcount == 'h7);
+    end
+    else if (byteCountStrobe | scsn_rs)
+      byteCountStrobe <= 0;
+
+  // //////////////////////////////////////////////////////
+  // MOSI snapshot register and output
+
+  reg [7:0] mosi_data_shift_reg;
+
+  always @(posedge SPI_CLK)
+    if (RST) begin
+      mosi_data_shift_reg <= 0;
+    end
+    else if (rising_sclk) begin
+      mosi_data_shift_reg <= {mosi_data_shift_reg[6:0], mosi_rs};
+    end
+
+  always @(posedge SPI_CLK)
+    if (RST) begin
+      mosi_data_out <= 0;
+    end
+    else if (byteCountStrobe) begin
+      mosi_data_out <= mosi_data_shift_reg;
+    end
+
+  always @(posedge SPI_CLK)
+    if (RST)
+      mosi_data_ready <= 0;
+    else
+      mosi_data_ready <= byteCountStrobe;
+
+  // //////////////////////////////////////////////////////
+  // MISO input capture and presentation to host
+
+  reg [7:0] miso_data_shift_reg;
+
+  always @(posedge SPI_CLK)
+    if (RST)
+      miso_data_request <= 0;
+    else
+      miso_data_request <= byteCountStrobe;
+
+  always @(posedge SPI_CLK)
+    if (RST) begin
+      miso_data_shift_reg <= 0;
+	end
+    else if (miso_data_request) begin
+      miso_data_shift_reg <= miso_data_in;
+	end
+    else if (falling_sclk)
+      miso_data_shift_reg <= {miso_data_shift_reg[6:0], 1'b0};
+
+  always @(posedge SPI_CLK)
+    if (RST)
+      MISO <= 0;
+    else
+      MISO <= miso_data_shift_reg[7];
+
+
+endmodule // spi
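Note on the strobes in the spi module: SCLK is never used as a clock. It is sampled in the SPI_CLK domain, and single-cycle rising and falling strobes are produced by comparing the newest sample with the previous one, gated by the active-low chip select. The sketch below shows that edge-detection idea in isolation. It is not part of this patch; the module name edge_detect_sketch and its ports are hypothetical, and it keeps one history register where the patch keeps a two-bit history, so its strobes appear one cycle earlier.

```verilog
// Hypothetical standalone sketch; not part of the patch.
// Generates one-cycle strobes on edges of an already synchronized signal.
module edge_detect_sketch (
  input  wire clk,
  input  wire rst,
  input  wire sig_sync,   // signal already synchronized to clk
  input  wire select_n,   // active-low qualifier (chip select)
  output reg  rising,
  output reg  falling
);

  reg prev;  // value of sig_sync one clk cycle ago

  always @(posedge clk)
    if (rst) begin
      prev    <= 1'b0;
      rising  <= 1'b0;
      falling <= 1'b0;
    end
    else begin
      prev    <= sig_sync;
      rising  <= !prev &  sig_sync & !select_n;  // 0 -> 1 while selected
      falling <=  prev & !sig_sync & !select_n;  // 1 -> 0 while selected
    end

endmodule
```

This approach needs the sampling clock to run several times faster than the sampled signal, since each edge must be seen as two distinct samples.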
diff --git a/verilog/rtl/decred_top/rtl/src/spi_passthrough.v b/verilog/rtl/decred_top/rtl/src/spi_passthrough.v
new file mode 100755
index 0000000..556f3a1
--- /dev/null
+++ b/verilog/rtl/decred_top/rtl/src/spi_passthrough.v
@@ -0,0 +1,150 @@
+// Copyright 2020 Matt Aamold, James Aamold
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Language: Verilog 2001
+
+`timescale 1ns / 1ps
+
+`include "decred_defines.v"
+
+module spi_passthrough (
+  input  wire  SPI_CLK,
+  input  wire  RSTin,
+  input  wire  ID_in,
+  input  wire  IRQ_in,
+  input  wire  address_strobe,
+  input  wire [6:0] currentSPIAddr,
+  input  wire [6:0] setSPIAddr,
+
+  input  wire  SCLKin,
+  input  wire  SCSNin,
+  input  wire  MOSIin,
+  output wire  MISOout,
+
+  output wire  rst_local,
+  output wire  sclk_local,
+  output wire  scsn_local,
+  output wire  mosi_local,
+  input  wire  miso_local,
+  input  wire  irq_local,
+  output wire  write_enable,
+
+  output wire  RSTout,
+  output wire  SCLKout,
+  output wire  SCSNout,
+  output wire  MOSIout,
+  input  wire  MISOin,
+  output wire  IRQout
+  );
+
+  // internal active-high reset: RSTin is active-low and is inverted in the synchronizer below
+  wire rst_wire;
+
+  // //////////////////////////////////////////////////////
+  // synchronizers
+  reg [1:0] id_resync;
+  reg [1:0] reset_resync;
+
+  always @(posedge SPI_CLK)
+    if (rst_wire) begin
+      id_resync <= 0;
+    end
+    else begin
+      id_resync <= {id_resync[0], ID_in};
+    end
+
+  reg [1:0] irq_resync;
+
+  always @(posedge SPI_CLK)
+    if (rst_wire) begin
+      irq_resync <= 0;
+    end
+    else begin
+      irq_resync <= {irq_resync[0], IRQ_in};
+    end
+
+  assign IRQout = irq_resync[1] | irq_local;
+
+  always @(posedge SPI_CLK)
+    begin
+      reset_resync <= {reset_resync[0], !RSTin};
+    end
+
+  // //////////////////////////////////////////////////////
+  // pass-through signals and pick-off
+
+  assign rst_wire = reset_resync[1];
+  assign rst_local = rst_wire;
+  assign RSTout = RSTin;
+
+  assign SCLKout = SCLKin;
+  assign sclk_local = SCLKin;
+
+  assign SCSNout = SCSNin;
+  assign scsn_local = SCSNin;
+
+  assign MOSIout = MOSIin;
+  assign mosi_local = MOSIin;
+
+  // //////////////////////////////////////////////////////
+  // MISO mux
+
+  wire unique_address_match;
+  wire id_active;
+  reg  local_address_select;
+
+  assign unique_address_match = (currentSPIAddr == setSPIAddr) ? 1'b1 : 1'b0;
+  assign id_active = id_resync[1];
+
+  always @(posedge SPI_CLK)
+    if (rst_wire) begin
+      local_address_select <= 0;
+    end
+    else begin
+      if (address_strobe) begin
+        if ((id_active) && (unique_address_match)) begin
+          local_address_select <= 1;
+        end else begin
+          local_address_select <= 0;
+        end
+      end
+    end
+
+  assign MISOout = (local_address_select) ? miso_local : MISOin;
+
+  // //////////////////////////////////////////////////////
+  // Write enable mask
+
+  wire global_address_match;
+  reg  global_address_select;
+
+  assign global_address_match = (currentSPIAddr == 7'b1111111) ? 1'b1 : 1'b0; // all-ones address is the global address
+
+  always @(posedge SPI_CLK)
+    if (rst_wire) begin
+      global_address_select <= 0;
+    end
+    else begin
+      if (address_strobe) begin
+        if ((id_active) && ((unique_address_match) || (global_address_match))) begin
+          global_address_select <= 1;
+        end else begin
+          global_address_select <= 0;
+        end
+      end
+    end
+
+  assign write_enable = global_address_select;
+
+endmodule // spi_passthrough
diff --git a/verilog/rtl/user_project_wrapper.v b/verilog/rtl/user_project_wrapper.v
index 0b23a50..30de619 100644
--- a/verilog/rtl/user_project_wrapper.v
+++ b/verilog/rtl/user_project_wrapper.v
@@ -61,10 +61,10 @@
 );
 
     /*--------------------------------------*/
-    /* User project is instantiated  here   */
+    /* Instantiation of decred_top.         */
     /*--------------------------------------*/
 
-    user_proj_example mprj (
+    decred_top mprj (
     `ifdef USE_POWER_PINS
 	.vdda1(vdda1),	// User area 1 3.3V power
 	.vdda2(vdda2),	// User area 2 3.3V power
@@ -75,34 +75,31 @@
 	.vssd1(vssd1),	// User area 1 digital ground
 	.vssd2(vssd2),	// User area 2 digital ground
     `endif
+	// inputs
+	.PLL_INPUT(user_clock2),
+	.EXT_RESET_N_fromHost(analog_io[0]),
+	.SCLK_fromHost(analog_io[1]),
+	.M1_CLK_IN(analog_io[2]),
+	.M1_CLK_SELECT(analog_io[3]),
+	.S1_CLK_IN(analog_io[4]),
+	.S1_CLK_SELECT(analog_io[5]),
+	.SCSN_fromHost(analog_io[6]),
+	.MOSI_fromHost(analog_io[7]),
+	.MISO_fromClient(analog_io[8]),
+	.IRQ_OUT_fromClient(analog_io[9]),
+	.ID_fromClient(analog_io[10]),
+	.SPI_CLK_RESET_N(analog_io[11]),
 
-	// MGMT core clock and reset
-
-    	.wb_clk_i(wb_clk_i),
-    	.wb_rst_i(wb_rst_i),
-
-	// MGMT SoC Wishbone Slave
-
-	.wbs_cyc_i(wbs_cyc_i),
-	.wbs_stb_i(wbs_stb_i),
-	.wbs_we_i(wbs_we_i),
-	.wbs_sel_i(wbs_sel_i),
-	.wbs_adr_i(wbs_adr_i),
-	.wbs_dat_i(wbs_dat_i),
-	.wbs_ack_o(wbs_ack_o),
-	.wbs_dat_o(wbs_dat_o),
-
-	// Logic Analyzer
-
-	.la_data_in(la_data_in),
-	.la_data_out(la_data_out),
-	.la_oen (la_oen),
-
-	// IO Pads
-
-	.io_in (io_in),
-    	.io_out(io_out),
-    	.io_oeb(io_oeb)
+	// outputs
+	.SCSN_toClient(analog_io[12]),
+	.SCLK_toClient(analog_io[13]),
+	.MOSI_toClient(analog_io[14]),
+	.EXT_RESET_N_toClient(analog_io[15]),
+	.ID_toHost(analog_io[16]),
+	.CLK_LED(analog_io[17]),
+	.MISO_toHost(analog_io[18]),
+	.HASH_LED(analog_io[19]),
+	.IRQ_OUT_toHost(analog_io[20])
     );
 
 endmodule	// user_project_wrapper