This is a snippet for benchmarking clojure application which I have used quite often in clojure applications at my job.
First an introduction to the tools:
- clj-asyc-profiler: for flamegraphs and call graph.
- criterion: for statistically correct benchmarking.
- tufte: application level profiling.
from this we can keep both #1 and #2 in separate profile just for benchmarking and #3 must be included in the :default profile, otherwise it would defeat its purpose.
below is the code snippet for having these libraries setup in project:
(defproject clojure-benchmark "0.1.0-SNAPSHOT"
:description "snippets to setup benchamrking tests in clojure"
:url "http://example.com/FIXME"
:license {:name "EPL-2.0 OR GPL-2.0-or-later WITH Classpath-exception-2.0"
:url "https://www.eclipse.org/legal/epl-2.0/"}
:test-selectors {:default (complement :bench)
:not (fn [m s] (not (contains? m (keyword s))))
:bench :bench}
:dependencies [[org.clojure/clojure "1.11.1"]
[com.taoensso/tufte "2.4.5"]]
:profiles {:test [{:dependencies [[org.clojure/test.check "1.1.1"]
[criterium "0.4.6"]]}]
:bench [{:test-paths ["bench"]
:global-vars {*assert* true}
:jvm-opts ["-Djdk.attach.allowAttachSelf"
"-XX:+UnlockDiagnosticVMOptions"
"-XX:+DebugNonSafepoints"]
:dependencies [[com.clojure-goes-fast/clj-async-profiler "1.0.4"]]}
:test]}
:repl-options {:init-ns clojure-benchmark.core})
important points to note are:
- We have created a new source locaction called
bench, this is added so that all of these overhead of benchmarking is not included in production code, and it can be isolated to run during local run or pipeline run. - There are
:test-selectorsbeing added so that we can specifically mark some test to be benchmark tests and run only them usinglein with-profile +bench test :bench.
The overhead mentioned above will be due to including below functionlities in the code:
- Runnig flamgraph profiling for the whole duration of the benchmark run.
- Making sure that
tuftecheck all the namespaces for gathering the application level metrics. - Making sure that all benchmark are being written to a file.
The code for above mentioned functionalities is:
(ns clojure-benchmark.bench
(:require
[clojure.java.io :as io]
[clj-async-profiler.core :as prof]
[taoensso.tufte :as tufte]))
;; necessary to print it so that jvm know that we are using its result
(def prof-started (let [start (atom (prof/start {}))
_ (println "profiling started." (prof/status))]
@start))
(alter-var-root #'tufte/*ns-filter* (constantly "*"))
;; accumulator for tufte result
(def stats-accumulator (tufte/add-accumulating-handler! {:ns-pattern "*"}))
;; before exiting make sure that we are printing profiling result
(System/setSecurityManager
(proxy [SecurityManager] []
(checkExit [status]
(let [_ (println "profiling ended." (prof/status))
_ (io/make-parents "profiling-result/random-file")
result (prof/stop)
file-name (.getName result)]
(io/make-parents (format "profiling-result/%s" file-name))
(io/copy result (io/as-file (format "profiling-result/%s" file-name)))
(with-open [writer (io/writer (format "profiling-result/tufte-prof-%d.txt" (System/currentTimeMillis)))]
(.write writer (tufte/format-grouped-pstats @stats-accumulator)))))
(checkCreateClassLoader [& args]
true)
(checkPermission [& args]
true)))
(ns clojure-benchmark.util
(:require [clojure.java.io :as io]
[clojure.test :refer :all]))
(defn write-to-file [file-name f & args]
(let [file-name (format "profiling-result/%s" file-name)
_ (io/make-parents file-name)]
(with-open [file (io/writer file-name)]
(binding [*out* file]
(apply f args)))))
Please check the sample test to understand how to use write the benchmark
(ns clojure-benchmark.core-test
(:require [clojure.test :refer :all]
[taoensso.tufte :refer :all]
[criterium.core :refer [benchmark report-result]]
[clojure-benchmark.util :refer [write-to-file]]
[clojure-benchmark.core :refer :all]))
(defn fib [n]
(cond (<= n 0) 0
(= n 1) 1
:else (+ (fib (- n 1)) (fib (- n 2)))))
(defn noob-fib [n] (p :noob (fib n)))
(def lazy-fib (lazy-cat [0 1] (map + lazy-fib (rest lazy-fib))))
(defn pro-fib [n]
(p :pro (nth lazy-fib n)))
(deftest a-test
(testing "test"
(is (= (noob-fib 10) 55))
(is (= (noob-fib 10) 55))
(is (= (pro-fib 13) 233))
(is (= (pro-fib 13) 233))))
(deftest ^:bench benchmark-fib
(profile {}
(let [input [10 13]
noob-fib (benchmark (doseq [inp input] (noob-fib inp)) {:verbose true :runtime true})
pro-fib (benchmark (doseq [inp input] (pro-fib inp)) {:verbose true :runtime true})]
(write-to-file "pro-fib.txt" report-result pro-fib :verbose :os :runtime)
(write-to-file "noob-fib.txt" report-result noob-fib :verbose :os :runtime))))
and its output for lein test:
lein test clojure-benchmark.core-test
Ran 1 tests containing 4 assertions.
0 failures, 0 errors.
lein with-profile +bench test :bench
profiling started. Profiling is running for 0 seconds
lein test clojure-benchmark.core-test
Ran 1 tests containing 0 assertions.
0 failures, 0 errors.
profiling ended. Profiling is running for 182 seconds
Profiling started
output of tufte:
:tufte/nil-id,
pId nCalls Min 50% ≤ 90% ≤ 95% ≤ 99% ≤ Max Mean MAD Clock Total
:noob 2,426,660 6.13μs 28.29μs 60.32μs 78.76μs 299.16μs 25.60ms 38.21μs ±76% 1.55m 52%
:pro 34,272,466 396.00ns 786.00ns 1.24μs 2.29μs 3.17μs 917.90ms 1.04μs ±59% 35.66s 20%
:tufte/compaction 45 39.54ms 155.80ms 456.94ms 791.48ms 1.52s 1.52s 219.91ms ±77% 9.90s 6%
Accounted 2.30m 78%
Clock 2.96m 100%
output of benchmark:
[hdggxin@nixos:~/workspace/clojure-benchmark/profiling-result]$ cat noob-fib.txt
amd64 Linux 6.1.51 4 cpu(s)
OpenJDK 64-Bit Server VM 25.362-bga
Runtime arguments: -Dfile.encoding=UTF-8 -Djdk.attach.allowAttachSelf -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -Dclojure.compile.path=/
home/hdggxin/workspace/clojure-benchmark/target/classes -Dclojure-benchmark.version=0.1.0-SNAPSHOT -Dclojure.debug=false
Evaluation count : 2773980 in 60 samples of 46233 calls.
Execution time sample mean : 21.751842 µs
Execution time mean : 21.807202 µs
Execution time sample std-deviation : 2.868742 µs
Execution time std-deviation : 2.972753 µs
Execution time lower quantile : 20.927824 µs ( 2.5%)
Execution time upper quantile : 25.620445 µs (97.5%)
Overhead used : 2.454320 ns
Found 13 outliers in 60 samples (21.6667 %)
low-severe 4 (6.6667 %)
low-mild 9 (15.0000 %)
Variance from outliers : 80.7320 % Variance is severely inflated by outliers
[hdggxin@nixos:~/workspace/clojure-benchmark/profiling-result]$ cat pro-fib.txt
amd64 Linux 6.1.51 4 cpu(s)
OpenJDK 64-Bit Server VM 25.362-bga
Runtime arguments: -Dfile.encoding=UTF-8 -Djdk.attach.allowAttachSelf -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -Dclojure.compile.path=/
home/hdggxin/workspace/clojure-benchmark/target/classes -Dclojure-benchmark.version=0.1.0-SNAPSHOT -Dclojure.debug=false
Evaluation count : 34700400 in 60 samples of 578340 calls.
Execution time sample mean : 1.807223 µs
Execution time mean : 1.808515 µs
Execution time sample std-deviation : 165.871690 ns
Execution time std-deviation : 166.917813 ns
Execution time lower quantile : 1.516781 µs ( 2.5%)
Execution time upper quantile : 2.079813 µs (97.5%)
Overhead used : 2.454320 ns
and finally the flamegraph:
please check full sample repo here: https://github.com/upendra1997/clojure-benchmark