  1. Mar 29, 2016
    • New variants of block based C backwards copy · eb1fccd5
      Siarhei Siamashka authored
      Because some processors are sensitive to the order of memory
      accesses, add a few more variants of the backwards memory buffer
      copy. These walk the buffer from end to start, but perform
      sequential writes in the forward direction within each sub-block
      of a fixed size. The most interesting sub-block sizes are 32 and
      64 bytes, because they match the most common CPU cache line sizes.
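
A minimal sketch of the idea, not the code from this commit: the outer loop visits sub-blocks from the last one to the first, while the inner loop writes the words of each sub-block in ascending address order. The function name `copy_backwards_blocked` and its parameters are hypothetical, and the sketch assumes that `size` is a multiple of `block` and that `block` is a multiple of 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not taken from the benchmark sources).
 * Copies `size` bytes from src to dst. Sub-blocks of `block` bytes are
 * visited from the end of the buffer towards the start, but the 64-bit
 * words inside each sub-block are written in the forward direction.
 * Assumes: size % block == 0 and block % sizeof(uint64_t) == 0. */
static void copy_backwards_blocked(uint64_t *dst, const uint64_t *src,
                                   size_t size, size_t block)
{
    size_t words_per_block = block / sizeof(uint64_t);
    size_t nblocks = size / block;

    /* Outer loop: last sub-block first. */
    while (nblocks-- > 0) {
        uint64_t *d = dst + nblocks * words_per_block;
        const uint64_t *s = src + nblocks * words_per_block;

        /* Inner loop: ascending addresses within the sub-block. */
        for (size_t i = 0; i < words_per_block; i++)
            d[i] = s[i];
    }
}
```

Calling it with `block` set to 32 or 64 would correspond to the two sub-block sizes discussed above. An optimizing compiler is free to transform such loops, so the actual store order should be checked in the generated code before drawing conclusions from timings.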
      
      Example reports:
      
      == ARM Cortex A7 ==
       C copy backwards                                     :    266.5 MB/s
       C copy backwards (32 byte blocks)                    :   1015.6 MB/s
       C copy backwards (64 byte blocks)                    :   1045.7 MB/s
       C copy                                               :   1033.3 MB/s
      
      == ARM Cortex A15 ==
       C copy backwards                                     :   1438.5 MB/s
       C copy backwards (32 byte blocks)                    :   1497.5 MB/s
       C copy backwards (64 byte blocks)                    :   2643.2 MB/s
       C copy                                               :   2985.8 MB/s
    • Benchmark reshuffled writes to the destination buffer · ada1db8c
      Siarhei Siamashka authored
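      Add fill variants which write to the destination buffer in a
      shuffled order within 16, 32 and 64 byte blocks. This is expected
      to test the ability to do write combining for scattered writes
      and to detect any possible performance penalties.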
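
As a rough illustration only, a shuffled fill could look like the sketch below. It is not the code from this commit: the name `fill_shuffled`, its parameters, and the particular permutation are assumptions made for the example. The sketch assumes that `block` is a power of two of at least 8 bytes and that `size` is a multiple of `block`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not taken from the benchmark sources).
 * Fills `size` bytes at dst_ with `value`, visiting blocks in ascending
 * order but storing the 64-bit words inside each block in a fixed
 * non-sequential order.
 * Assumes: block is a power of two >= 8 and size % block == 0. */
static void fill_shuffled(uint64_t *dst_, uint64_t value,
                          size_t size, size_t block)
{
    /* volatile keeps the compiler from merging or reordering the stores */
    volatile uint64_t *dst = dst_;
    size_t words_per_block = block / sizeof(uint64_t);
    size_t nblocks = size / block;

    for (size_t b = 0; b < nblocks; b++) {
        volatile uint64_t *d = dst + b * words_per_block;

        /* An odd multiplier modulo a power of two is a bijection, so
         * every word of the block is written exactly once. */
        for (size_t i = 0; i < words_per_block; i++)
            d[(i * 5 + 3) & (words_per_block - 1)] = value;
    }
}
```

With `block` set to 16, 32 or 64 this corresponds to the three shuffled rows in the reports. If the hardware cannot combine the out-of-order stores into full cache line writes, the shuffled variants would be expected to run slower than a plain sequential fill.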
      
      Example reports:
      
      == ARM Cortex A7 ==
       C fill                                               :   4011.5 MB/s
       C fill (shuffle within 16 byte blocks)               :   4112.2 MB/s (0.3%)
       C fill (shuffle within 32 byte blocks)               :    333.9 MB/s
       C fill (shuffle within 64 byte blocks)               :    336.6 MB/s
      
      == ARM Cortex A15 ==
       C fill                                               :   6065.2 MB/s (0.4%)
       C fill (shuffle within 16 byte blocks)               :   2152.0 MB/s
       C fill (shuffle within 32 byte blocks)               :   2150.7 MB/s
       C fill (shuffle within 64 byte blocks)               :   2238.2 MB/s
      
      == ARM Cortex A53 ==
       C fill                                               :   3080.8 MB/s (0.2%)
       C fill (shuffle within 16 byte blocks)               :   3080.7 MB/s
       C fill (shuffle within 32 byte blocks)               :   3079.2 MB/s
       C fill (shuffle within 64 byte blocks)               :   3080.4 MB/s
      
      == Intel Atom N450 ==
       C fill                                               :   1554.9 MB/s
       C fill (shuffle within 16 byte blocks)               :   1554.5 MB/s
       C fill (shuffle within 32 byte blocks)               :   1553.9 MB/s
       C fill (shuffle within 64 byte blocks)               :   1554.4 MB/s
      
      See https://github.com/ssvb/tinymembench/issues/7
  2. Sep 09, 2011
  3. Sep 08, 2011