Conversation
| # Parse arguments | ||
| MODE="check" | ||
| if [[ $# -gt 0 ]]; then | ||
| case "$1" in | ||
| --baseline) | ||
| MODE="baseline" | ||
| ;; | ||
| --check) | ||
| MODE="check" | ||
| ;; | ||
| *) | ||
| usage | ||
| ;; | ||
| esac | ||
| fi |
There was a problem hiding this comment.
Feels weird to combine the check and validate in here to deduplicate the 3 example calls, and not do the same for the benchmark. I'd merge all three.
| @@ -0,0 +1,131 @@ | |||
| #!/bin/bash | |||
| IFS=':' read -r input train whole name <<< "$example" | ||
| echo " Processing $name..." | ||
| run_example "$input" "$train" "$whole" "$BASELINE_DIR/$name" |
There was a problem hiding this comment.
Rather than putting the example in a string array, splitting and naming them, then naming them again in the run method, I'd rather write three methods
NC454() { "$BINARY" -s example/NC_000913-454.fna -t 454_10 -w 0 -o NC_000913-454; }
...
examples=(NC454 ...)
And loop through the methods to call them directly.
| BINARY="$PROJECT_ROOT/target/release/FragGeneScanRs" | ||
|
|
||
| # Check for hyperfine | ||
| if ! command -v hyperfine &> /dev/null; then |
There was a problem hiding this comment.
Just use ! hyperfine -V, no need for command.
| echo " Warning: Baseline file $baseline_file not found" | ||
| continue |
There was a problem hiding this comment.
I'd rather be defensive and have this fail if there is no baseline to be found.
|
I'll stop putting any effort into these branches. I'm sure there are large performance gains possible, but they are too dependant on CPU architecture to reliably benchmark. Every optimization I made on apple silicon had the opposite effect on @ninewise his machine. On x86, the biggest gains are in restructuring memory and operations to make use of additional vectorisation and AVX512 instructions, but these are not available on Apple Silicon. In addition, memory pressure can be reduced by reusing alpha and path instead of allocation memory each time. On apple silicon hower, this is slower because it is extremely fast in allocating zeroed memory by use of "zero pages". |
This PR prepares for a set of potential performance improvements by adding a validation and benchmark script. Both scripts run on the 3 files in the examples directory.
Performance baseline on my macbook:
1.000×
1.000×
1.000×
1.235×
1.396×
1.475×
1.459×
1.550×
1.681×
1.492×
1.513×
1.684×
1.671×
1.567×
1.714×
Other lessons learned