# Partitioning FPGA-Optimized Systolic Arrays

We provide a fast optimization algorithm and a step-by-step guide on how to generate the dataset for a specific board and set of topologies to be used by our optimization tool.

>    *Long Chung Chan, Gurshaant Singh Malik and Nachiket Kapre*

>    [**"Partitioning FPGA-Optimized Systolic Arrays for Fun and Profit"**](optimization_algo/paper/PID6211513.pdf)

>    2019 International Conference on Field-Programmable Technology

## Demo
The following demos use pre-generated datasets and topologies that can be found in:

- [topologies](optimization_algo/topologies/) contains all the topologies describing their respective CNN structures
- [data_source](optimization_algo/data_source/) contains all the cycle-accurate data generated using [SCALE-Sim](https://github.com/ARM-software/SCALE-Sim)

The instructions below run a sweep on each of the following networks:
   - FasterRCNN
   - Mobilenet
   - Yolo tiny
   - Googlenet
   - Alexnet
   - AlphaGoZero
   - NCF_rec
   - Resnet_50_v1

To obtain the optimization result for a specific network and a specific number of partitions, please refer to the step-by-step guide below.

To obtain optimization results with:
1. Covariance Matrix Adaptation Evolution Strategy (CMA-es)
    ```bash
        cd optimization_algo/scripts
        ./sweep_nets_cma.sh
    ```

2. Genetic Algorithm (GA)
    ```bash
        cd optimization_algo/scripts
        ./sweep_nets_ga.sh
    ```

3. Hyperparameter Optimization
    ```bash
        cd optimization_algo/scripts
        ./sweep_nets_ho.sh
    ```

4. Brute Force
    ```bash
        cd optimization_algo/scripts
        ./sweep_nets_brute.sh
    ```

The optimization results are added to the corresponding csv file under this [folder](optimization_algo/resulting_csv).
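
To confirm that a sweep actually added results, it can help to compare row counts before and after a run. The following is a minimal sketch, not part of the repository: `count_rows` is a hypothetical helper, and it makes no assumption about the column layout of the result files (which is not documented here), so it only counts non-empty rows.

```python
import csv
from pathlib import Path


def count_rows(folder):
    """Map each csv file name in `folder` to its number of non-empty rows."""
    counts = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            # csv.reader yields an empty list for a blank line, which is skipped.
            counts[path.name] = sum(1 for row in csv.reader(handle) if row)
    return counts


if __name__ == "__main__":
    # Path is relative to the repository root; adjust it to your checkout.
    for name, rows in count_rows("optimization_algo/resulting_csv").items():
        print(f"{name}: {rows} rows")
```

Running it once before and once after a sweep shows which files grew.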

## Step-by-step detailed guide

### 1. Custom topologies
SCALE-Sim requires a `.csv` file containing the following attributes for each layer in the network:
   1. Layer name
   2. IFMAP Height
   3. IFMAP Width
   4. Filter Height
   5. Filter Width
   6. Channels
   7. Number of Filters
   8. Strides
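
As an illustration of the layout described above, the sketch below writes a topology file with one row per layer. It is only a sketch under stated assumptions: the header spellings in `TOPOLOGY_COLUMNS`, the helper `write_topology`, and the two example layers are all hypothetical, so compare the output against the files under `optimization_algo/topologies/` before using it with SCALE-Sim.

```python
import csv
import io

# Column order follows the attribute list above. The exact header
# spellings expected by SCALE-Sim are an assumption.
TOPOLOGY_COLUMNS = [
    "Layer name", "IFMAP Height", "IFMAP Width", "Filter Height",
    "Filter Width", "Channels", "Number of Filters", "Strides",
]


def write_topology(layers, out):
    """Write a header plus one row per layer to the text stream `out`.

    Each layer is a tuple: a name followed by seven positive integers.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(TOPOLOGY_COLUMNS)
    for layer in layers:
        if len(layer) != len(TOPOLOGY_COLUMNS):
            raise ValueError(f"expected {len(TOPOLOGY_COLUMNS)} fields: {layer!r}")
        name, *dims = layer
        if not all(isinstance(d, int) and d > 0 for d in dims):
            raise ValueError(f"layer {name!r} needs positive integer dimensions")
        writer.writerow([name, *dims])


# Hypothetical two-layer network, for illustration only.
example_layers = [
    ("Conv1", 224, 224, 11, 11, 3, 96, 4),
    ("Conv2", 27, 27, 5, 5, 96, 256, 1),
]
buffer = io.StringIO()
write_topology(example_layers, buffer)
print(buffer.getvalue())
```

Replacing `buffer` with an open file handle writes the same content to disk.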

Examples can be found under [topologies](optimization_algo/topologies/). If you have trouble working out the correct topology file for a specific network, the [Netscope CNN Analyzer](https://dgschwend.github.io/netscope/quickstart.html) can help.

### 2. Custom hardware model
SCALE-Sim also requires a config file describing your hardware model. The config file used to generate all the data in the paper is [`US_sim.cfg`](master/scaleSim/configs/US_sim.cfg). Please refer to [SCALE-Sim](https://github.com/ARM-software/SCALE-Sim) for more detail on how to create your own config file.

### 3. Generate data source using SCALE-Sim
We made small modifications to SCALE-Sim to:
1. Enable multiprocessing to speed up dataset generation
2. Sweep through every layer with incrementally increasing resources

For this reason, we have created a separate bash script so you don't have to run SCALE-Sim yourself.

For example, to obtain the dataset for `US_sim.cfg` with `Alexnet`:

```bash
    cd scaleSim
    ./generate_data_set.sh configs/US_sim.cfg ../topologies/960_DNN/Alexnet.csv 
```

The default number of parallel processes is `6`. It can be changed on line 257 of [scale.py](scaleSim/scale.py). However, SCALE-Sim creates temporary csv files for caching, so be careful when raising this number to avoid a `DiskOutOfSpace` error.

```python
    ...
    pool = Pool(processes = 6) # change the number of processes here
    for pro in pool.imap_unordered(self.run_mp_once, all_arr_dim_list):
        self.run_name = net_name + "_" + self.dataflow + "_" + str(pro[0]) + "x" + str(pro[1])
        self.cleanup(pro)
    pool.close()
    ...
```
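
The disk-space concern can be turned into a quick check before raising the process count. The sketch below is not part of `scale.py`: `safe_process_count` is a hypothetical helper, and `ESTIMATED_TEMP_BYTES` is a placeholder value that you would need to replace with the temporary-file footprint you measure for one process on your own machine.

```python
import shutil


def safe_process_count(free_bytes, bytes_per_process, requested=6):
    """Cap `requested` so the temporary files of all processes fit in `free_bytes`.

    Always returns at least 1, even if the space looks insufficient for one process.
    """
    if bytes_per_process <= 0:
        raise ValueError("bytes_per_process must be positive")
    affordable = free_bytes // bytes_per_process
    return max(1, min(requested, affordable))


# Placeholder estimate (2 GiB per process); measure the real value yourself.
ESTIMATED_TEMP_BYTES = 2 * 1024**3

free = shutil.disk_usage(".").free  # free bytes on the current drive
print(safe_process_count(free, ESTIMATED_TEMP_BYTES))
```

The printed number is only a suggestion for the `processes` argument; it does not modify SCALE-Sim.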

Once generation finishes, the data are spread across multiple files under the `outputs` directory. Run the following script to repack them into one csv file:

```bash
    cd scaleSim
    ./generate_final_csv.sh Alexnet ../optimization_algo/data_source/alexnet_mem_bound.csv 10 
```

The scripts assume the following file names:
1. The csv file containing cycle-accurate data generated by SCALE-Sim: `{topology name}_mem_bound.csv`
2. The csv file containing the topology of the CNN: `{topology name}.csv`
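
A small check like the following can catch naming mismatches before an optimization run. It is a sketch only: `expected_files` and `missing_files` are hypothetical helpers, the directory arguments are placeholders, and whether the scripts treat the topology name case-sensitively is not stated here, so pass the name exactly as your files spell it.

```python
from pathlib import Path


def expected_files(topology_name, data_dir, topology_dir):
    """Return the two paths implied by the naming convention above."""
    return (
        Path(data_dir) / f"{topology_name}_mem_bound.csv",
        Path(topology_dir) / f"{topology_name}.csv",
    )


def missing_files(topology_name, data_dir, topology_dir):
    """List the expected files that do not exist yet."""
    return [
        path
        for path in expected_files(topology_name, data_dir, topology_dir)
        if not path.is_file()
    ]


if __name__ == "__main__":
    # Placeholder directories, relative to the repository root.
    for path in missing_files(
        "alexnet", "optimization_algo/data_source", "optimization_algo/topologies"
    ):
        print(f"missing: {path}")
```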

### 4. Running the script for a specific approach
Run the following from the `optimization_algo/scripts` directory:
1. Covariance Matrix Adaptation Evolution Strategy (CMA-es)
    > Please uncomment lines 349-364 in `cma_approach.py` to see the output
    ```bash
        # python3 ../approaches/cma_approach.py ${network name} ${number of partitions} ${population size} ${resource unit available} ${strategy} ${optimization target}
        python3 ../approaches/cma_approach.py alexnet 3 100 960 allzeros DRAM_cycle
    ```
    Result screenshot: ![](screenshots/cma_result.png)
    
2. Genetic Algorithm (GA)
    > Please uncomment lines 287-299 in `ga_approach.py` to see the output
    ```bash
        # python3 ../approaches/ga_approach.py ${network name} ${number of partitions} ${elite population size} ${population size} ${resource unit available} ${optimization target}
        python3 ../approaches/ga_approach.py alexnet 3 10 100 960 DRAM_cycle
    ```
    Result screenshot: ![](screenshots/ga_result.png)

3. Hyperparameter Optimization
    ```bash
        # python3 ../approaches/hyper_parameter_ga.py ${network name} ${number of partitions} ${resource unit available} ${target} ${max iteration}
        python3 ../approaches/hyper_parameter_ga.py alexnet 3 960 DRAM_cycle 2500
    ```
    Result screenshot: ![](screenshots/ho_result.png)

4. Brute Force
    ```bash
        # python3 ../approaches/brute_force_approach.py ${network name} ${number of partitions} ${resource unit available} ${target}
        python3 ../approaches/brute_force_approach.py alexnet 3 960 DRAM_cycle
    ```
    Result screenshot: ![](screenshots/brute_result.png)

The optimization results are also added to the corresponding csv file under this [folder](https://git.uwaterloo.ca/watcag-public/fpga-syspart/blob/master/optimization_algo/resulting_csv).
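
When scripting several runs, the positional arguments documented above can be assembled programmatically to avoid mixing up their order. This is a minimal sketch, not part of the repository: `build_command` and the keyword names in `APPROACH_ARGS` are hypothetical labels for the positions shown in the usage comments, and the relative script path assumes you run from `optimization_algo/scripts`.

```python
import shlex

# Positional argument order per script, taken from the usage comments above.
# The keyword names are hypothetical labels, not flags the scripts accept.
APPROACH_ARGS = {
    "cma_approach.py": (
        "network", "partitions", "population", "resources", "strategy", "target"),
    "ga_approach.py": (
        "network", "partitions", "elite_population", "population", "resources", "target"),
    "hyper_parameter_ga.py": (
        "network", "partitions", "resources", "target", "max_iteration"),
    "brute_force_approach.py": (
        "network", "partitions", "resources", "target"),
}


def build_command(script, **values):
    """Return the argv list for one run, checking the arguments are complete."""
    names = APPROACH_ARGS[script]
    missing = [name for name in names if name not in values]
    extra = [name for name in values if name not in names]
    if missing or extra:
        raise ValueError(f"{script}: missing {missing}, unexpected {extra}")
    return ["python3", f"../approaches/{script}", *(str(values[n]) for n in names)]


cmd = build_command(
    "cma_approach.py", network="alexnet", partitions=3, population=100,
    resources=960, strategy="allzeros", target="DRAM_cycle",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the run.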

<!-- ## Repo Breakdown -->

## License
This tool is distributed under the MIT license.
Copyright (c) 2019 Long Chung Chan, Gurshaant Singh Malik, Nachiket Kapre

<div style="text-align: justify;"> 
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
<br><br>
</div>

<div style="text-align: justify;"> 
<b>The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.</b>
<br><br>
</div>

<div style="text-align: justify;"> 
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
</div>