According to the README, Rust is supported. Has anyone tried it and noticed an improvement? rui314/mold: Mold: A Modern Linker 🦠 https://github.com/rui314/mold

  • MoSal@lemm.ee
    1 year ago

    Okay. I updated mold to v2.0.0, added "-Z", "time-passes" to the rustc flags to get link times, and ran cargo with --timings to get CPU utilization graphs. I tested on two projects of mine (the one from yesterday is “X”).

    Each link time is the best of 3-4 runs, with only whitespace in main.rs changed between runs.
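A run like the ones described above could be scripted roughly as follows. This is only a sketch under assumptions, not the setup actually used for these numbers: the helper names (`bench_env`, `best_of`) are made up for illustration, it assumes a nightly toolchain is active (`-Z time-passes` is nightly-only), and it assumes the linker is selected through the C compiler driver with `-fuse-ld`.

```python
import os

# Assumed invocation: release build with Cargo's timing report enabled.
CARGO_CMD = ["cargo", "build", "--release", "--timings"]


def bench_env(linker, lto, codegen_units, base_env=None):
    """Build the environment for one benchmark run (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    # Cargo reads profile overrides from CARGO_PROFILE_<NAME>_<KEY> variables.
    env["CARGO_PROFILE_RELEASE_LTO"] = lto
    env["CARGO_PROFILE_RELEASE_CODEGEN_UNITS"] = str(codegen_units)
    # -Z time-passes prints per-pass timings (including linking);
    # -fuse-ld=<linker> asks the compiler driver to link with lld or mold.
    env["RUSTFLAGS"] = f"-Z time-passes -C link-arg=-fuse-ld={linker}"
    return env


def best_of(link_times):
    """Pick the fastest of several measured link times, as the post does."""
    if not link_times:
        raise ValueError("need at least one measurement")
    return min(link_times)


# Example (not executed here):
#   import subprocess
#   subprocess.run(CARGO_CMD, env=bench_env("mold", "thin", 8), check=True)
```

Taking the minimum rather than the mean is a common choice for this kind of measurement, since background noise can only make a run slower, never faster.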

    lto="fat"           lld (s)    mold (s)
    Project X (cu=1)    105.923    106.380
    Project X (cu=8)    103.512    103.513
    Project S (cu=1)     94.290     94.969
    Project S (cu=8)    100.118    100.449

    Observations (lto="fat"): As expected, there is little multi-core utilization. Using more than one codegen unit may even cause a regression in link time (as it did for Project S). The choice between lld and mold appears to make no significant difference.


    lto="thin"           lld (s)    mold (s)
    Project X (cu=1)     46.596     47.118
    Project X (cu=8)     34.167     33.839
    Project X (cu=16)    36.296     36.621
    Project S (cu=1)     41.817     41.404
    Project S (cu=8)     32.062     32.162
    Project S (cu=16)    35.780     36.074

    Observations (lto="thin"): Here, we see parallel LLVM_lto_optimize runs kicking in. I also tested with codegen-units=16. In that case, there were so many parallel LLVM_lto_optimize runs that the synchronization overhead caused a regression on my humble workstation, which is powered by an Intel i7-7700K (only 4 physical, 8 logical cores). The cu=16 results will probably look different on a more powerful setup. Still, the choice between lld and mold appears to make no significant difference.
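To put a number on the cu=16 regression, here is a small sketch that uses only the lld column of the lto="thin" table above. The function name is made up for illustration.

```python
# lld link times in seconds for lto="thin", copied from the table above.
thin_lld = {
    "Project X": {1: 46.596, 8: 34.167, 16: 36.296},
    "Project S": {1: 41.817, 8: 32.062, 16: 35.780},
}


def change_percent(times, base_cu, new_cu):
    """Percent change in link time when moving from base_cu to new_cu."""
    return (times[new_cu] - times[base_cu]) / times[base_cu] * 100.0


regressions = {
    project: change_percent(times, 8, 16)
    for project, times in thin_lld.items()
}

for project, pct in regressions.items():
    print(f"{project}: cu=8 -> cu=16 changes link time by {pct:+.1f}%")
```

On this machine, going from 8 to 16 codegen units costs roughly 6% for Project X and roughly 12% for Project S, while both are still faster than cu=1.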


    lto=false            lld (s)    mold (s)
    Project X (cu=1)     29.160     29.231
    Project X (cu=8)      8.130      8.293
    Project X (cu=16)     7.076      6.953
    Project S (cu=1)     11.996     12.069
    Project S (cu=8)      4.418      4.462
    Project S (cu=16)     4.357      4.455

    Observations (lto=false): Here, codegen-units becomes the dominant factor with no heavy LLVM_lto_optimize runs involved. Going above codegen-units=8 does not hurt link time. Still, the choice of linker between lld and mold appears to be of no significance.
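How dominant codegen-units is here can be expressed as a speedup over cu=1. This sketch uses the lld column of the lto=false table above; the helper name is an assumption for illustration.

```python
# lld link times in seconds for lto=false, copied from the table above.
false_lld = {
    "Project X": {1: 29.160, 8: 8.130, 16: 7.076},
    "Project S": {1: 11.996, 8: 4.418, 16: 4.357},
}


def speedups(times):
    """Speedup factor of each codegen-units setting relative to cu=1."""
    base = times[1]
    return {cu: base / t for cu, t in times.items() if cu != 1}


results = {project: speedups(times) for project, times in false_lld.items()}

for project, by_cu in results.items():
    for cu, factor in sorted(by_cu.items()):
        print(f"{project}: cu={cu} is {factor:.2f}x faster than cu=1")
```

Most of the gain comes from the step to cu=8. Going from 8 to 16 adds comparatively little for Project X and almost nothing for Project S.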


    lto="off"            lld (s)    mold (s)
    Project X (cu=1)     29.109     29.201
    Project X (cu=8)      5.896      6.117
    Project X (cu=16)     3.479      3.637
    Project S (cu=1)     11.732     11.742
    Project S (cu=8)      2.354      2.355
    Project S (cu=16)     1.517      1.499

    Observations (lto="off"): Same observations as lto=false. Still, the choice of linker between lld and mold appears to be of no significance.
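The "no significance" claim can be checked against the lto="off" numbers, where total link time is smallest and a linker difference would be most visible. A minimal sketch using the table above, with made-up helper names:

```python
# (project, codegen-units, lld seconds, mold seconds) for lto="off",
# copied from the table above.
off_times = [
    ("Project X", 1, 29.109, 29.201),
    ("Project X", 8, 5.896, 6.117),
    ("Project X", 16, 3.479, 3.637),
    ("Project S", 1, 11.732, 11.742),
    ("Project S", 8, 2.354, 2.355),
    ("Project S", 16, 1.517, 1.499),
]


def relative_diff_percent(lld, mold):
    """Positive means mold was slower than lld, negative means faster."""
    return (mold - lld) / lld * 100.0


diffs = {
    (project, cu): relative_diff_percent(lld, mold)
    for project, cu, lld, mold in off_times
}

# The configuration where the two linkers differ the most, in relative terms.
worst = max(diffs, key=lambda key: abs(diffs[key]))

for (project, cu), pct in diffs.items():
    print(f"{project} (cu={cu}): mold vs lld {pct:+.2f}%")
```

Even in the worst case (Project X at cu=16) the two linkers are within 5% of each other, which is a gap of about 0.16 seconds.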


    Debug builds link in under 0.4 seconds.