Commit 8a2b42e0 authored by LongChan

added detail steps on generate dataset

parent 75008919
......@@ -16,7 +16,6 @@ We provide a fast optimization algorithm and a step-to-step guide on how to gene
- [] Running the optimization algorithm on the generated dataset
- [] What each file/directory is responsible for
- [] Changes done to ScaleSim
- [] Explanation of the testing result
## Demo
The following demos use pre-generated datasets and topologies that can be found in:
......@@ -39,39 +38,108 @@ To obtain individual optimization result for a specific network and a specfic nu
To get optimization result with:
1. Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
```bash
cd optimization_algo/scripts
./sweep_nets_cma.sh
```
2. Genetic Algorithm (GA)
```bash
cd optimization_algo/scripts
./sweep_nets_ga.sh
```
3. Hyperparameter Optimization
```bash
cd optimization_algo/scripts
./sweep_nets_ho.sh
```
4. Brute Force
```bash
cd optimization_algo/scripts
./sweep_nets_brute.sh
```
The optimization results are added to the corresponding csv file under this [folder](https://git.uwaterloo.ca/watcag-public/fpga-syspart/blob/master/optimization_algo/resulting_csv).
## Detailed step-by-step guide
### 1. Custom topologies
SCALE-sim requires a `.csv` file containing the following attributes for each layer in the network:
1. Layer name
2. IFMAP Height
3. IFMAP Width
4. Filter Height
5. Filter Width
6. Channels
7. Number of Filters
8. Strides
Examples can be found under [topologies](https://git.uwaterloo.ca/watcag-public/fpga-syspart/blob/master/optimization_algo/topologies/). If you have trouble working out the correct topology file for a specific network, check out the [Netscope CNN Analyzer](https://dgschwend.github.io/netscope/quickstart.html).
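As a minimal sketch of the file layout described above, the following Python snippet writes a topology file with the eight attributes in the listed order. The header spellings, the `write_topology` helper, and the example layer values are assumptions for illustration only; compare the result against a file under `topologies` before relying on it.

```python
import csv
import io

# Column order follows the attribute list above. The exact header text that
# SCALE-sim expects is an assumption; check an existing topology file.
COLUMNS = [
    "Layer name", "IFMAP Height", "IFMAP Width", "Filter Height",
    "Filter Width", "Channels", "Number of Filters", "Strides",
]


def write_topology(layers, out):
    """Write one CSV row per layer to the text stream `out` (hypothetical helper)."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    for layer in layers:
        if len(layer) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(layer)}: {layer}")
        name, *dims = layer
        # Every numeric attribute must be a positive integer.
        if any(int(d) <= 0 for d in dims):
            raise ValueError(f"layer {name!r} has a non-positive dimension")
        writer.writerow([name, *dims])


# Illustrative values only: a single convolution layer.
buf = io.StringIO()
write_topology([("conv1", 227, 227, 11, 11, 3, 96, 4)], buf)
print(buf.getvalue())
```

To produce a real file, pass a handle opened with `open(path, "w", newline="")` instead of the in-memory buffer.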
### 2. Custom hardware model
SCALE-sim also requires a config file describing your hardware model. The config file used to generate all the data in the paper is [`US_sim.cfg`](https://git.uwaterloo.ca/watcag-public/fpga-syspart/blob/master/scaleSim/configs/US_sim.cfg). Please refer to the [SCALE-Sim](https://github.com/ARM-software/SCALE-Sim) repository for more detail on how to create your own topology file.
### 3. Generate data source using SCALE-sim
SCALE-sim has been slightly modified to:
1. Enable multiprocessing to speed up dataset generation
2. Sweep through every layer with incrementally increasing resources
For this reason, we provide a separate bash script so that you do not have to run SCALE-sim yourself.
For example, to obtain the dataset for `US_sim.cfg` with `Alexnet`:
```bash
cd scaleSim
./generate_data_set.sh configs/US_sim.cfg ../topologies/960_DNN/Alexnet.csv
```
By default, `6` processes run in parallel. This can be changed on line 257 of [scale.py](https://git.uwaterloo.ca/watcag-public/fpga-syspart/blob/master/scaleSim/scale.py). However, SCALE-sim creates temporary csv files for caching, so increase this number with care to avoid a `DiskOutOfSpace` error.
```python
...
pool = Pool(processes = 6) # RIGHT HERE !!!
for pro in pool.imap_unordered(self.run_mp_once, all_arr_dim_list):
    self.run_name = net_name + "_" + self.dataflow + "_" + str(pro[0]) + "x" + str(pro[1])
    self.cleanup(pro)
pool.close()
...
```
Once generation finishes, the data is spread across multiple files under the `outputs` directory. Run the following script to repack them into a single csv file:
```bash
cd scaleSim
./generate_final_csv.sh Alexnet ../optimization_algo/data_source/alexnet_mem_bound.csv 10
```
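To illustrate what repacking means here, the following is a rough Python sketch that concatenates per-run csv files into one. It is not the repository's `csv_reorganizer.py`: the `merge_csv_files` helper, the directory layout under `outputs`, the file name `cycles.csv`, and the column names are all assumptions, and the third argument of `generate_final_csv.sh` is not modeled.

```python
import csv
import tempfile
from pathlib import Path


def merge_csv_files(output_dir, net_name, merged_path):
    """Concatenate every CSV under output_dir whose relative path contains net_name.

    Assumes all matching files share the same header row; the header is
    written once. Returns the number of data rows copied.
    """
    output_dir = Path(output_dir)
    merged_path = Path(merged_path)
    sources = sorted(
        p for p in output_dir.rglob("*.csv")
        if net_name in p.relative_to(output_dir).as_posix()
        and p.resolve() != merged_path.resolve()
    )
    header = None
    rows_written = 0
    with open(merged_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in sources:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


# Demo on made-up data: two run directories, one csv file each.
with tempfile.TemporaryDirectory() as tmp:
    outputs = Path(tmp) / "outputs"
    for dims in ("8x8", "16x16"):
        run_dir = outputs / f"Alexnet_os_{dims}"
        run_dir.mkdir(parents=True)
        (run_dir / "cycles.csv").write_text("Layer,Cycles\nconv1,100\n")
    merged = Path(tmp) / "alexnet_mem_bound.csv"
    count = merge_csv_files(outputs, "Alexnet", merged)
    merged_lines = merged.read_text().splitlines()
print(count, merged_lines)
```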
The scripts assume the following file names:
1. The csv file containing cycle-accurate data generated from SCALE-sim: `{topology name}_mem_bound.csv`
2. The csv file containing the topology of the CNN: `{topology name}.csv`
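The naming convention above can be checked before launching an optimization run. This is a small sketch under assumptions: `expected_files` and `missing_files` are hypothetical helpers, and the default directory names are placeholders that you should point at wherever your files actually live.

```python
from pathlib import Path


def expected_files(topology_name, data_dir="data_source", topology_dir="topologies"):
    """Build the two file paths implied by the naming convention."""
    return {
        "dataset": Path(data_dir) / f"{topology_name}_mem_bound.csv",
        "topology": Path(topology_dir) / f"{topology_name}.csv",
    }


def missing_files(topology_name, **dirs):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_files(topology_name, **dirs).values() if not p.exists()]


files = expected_files("alexnet")
print(files["dataset"], files["topology"])
```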
### 4. Running script targeting specific approach
1. Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
```bash
python3 ../approaches/cma_approach.py ${net} ${partitions} ${popsize} ${res_unit} ${strategy} ${target}
```
2. Genetic Algorithm (GA)
```bash
cd optimization_algo/scripts
./sweep_nets_ga.sh
```
3. Hyperparameter Optimization
```bash
cd optimization_algo/scripts
./sweep_nets_ho.sh
```
4. Brute Force
```bash
cd optimization_algo/scripts
./sweep_nets_brute.sh
```
## Repo Breakdown
......
#!/bin/bash
# ./generate_final_csv.sh yolo_tiny optimization_algo/yolo_tiny_mem_bound.csv 960
if [ -n "$1" ]; then
    # Quote the arguments so paths containing spaces are passed through intact
    python3 csv_reorganizer.py outputs/ "$1" "$2" "$3"
else
    echo "Positional parameter 1 is empty"
fi
\ No newline at end of file