Difference between revisions of "Tests of a SIMD version of DVector3"

From GlueXWiki
Revision as of 11:18, 4 May 2010

  • SIMD (Single-Instruction Multiple-Data) instructions apply the same calculation or manipulation to several pieces of data with a single operation
    • The dedicated SSE registers are 128 bits wide: each holds 4 single-precision floating-point values (SSE) or 2 double-precision floating-point values (SSE2), which are processed at the same time
    • Basic arithmetic and logical operations are supported, as are square roots
  • Implemented DVector3 with SSE2 instructions
    • Coordinates are stored as three pairs: xy, yz, zx
    • gcc compiler flags: -msse2 -mfpmath=sse -O2
  • Comparison with TVector3 on ifarml6: 10 million events with randomly generated vectors (times in seconds; T3/D3 is the TVector3 time divided by the DVector3 time)
Operation        DVector3 time (s)    TVector3 time (s)     T3/D3
-----------------------------------------------------------------
 v2=-v1               0.054                 0.171            3.17
 v3=v1+v2             0.092                 0.191            2.08
 v3+=v1               0.056                 0.048            0.86 <- TVector3 faster
 v3=v1-v2             0.092                 0.194            2.11
 v2=k*v1              0.054                 0.174            3.24
 v2*=k                0.030                 0.018            0.60 <- TVector3 faster
 SetMag               0.198                 0.198            1.00
 SetXYZ               0.007                 0.008            1.14
 SetMagThetaPhi       0.680                 0.890            1.31  Expensive!
 Rotate(a,v)          0.625                 1.528            2.44  Expensive!
 RotateZ(a)           0.353                 0.402            1.14
 make orthogonal v    0.142                 0.273            1.92
 v3 = v1 x v2         0.107                 0.193            1.80
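
The xy, yz, zx pair layout listed above can be illustrated with a minimal sketch using SSE2 intrinsics. This is not the actual DVector3 source: the type name PairVec3, its accessors, and the free function cross are hypothetical and chosen only for illustration. The sketch assumes an x86 or x86-64 compiler that provides <emmintrin.h>. It shows one plausible reason for storing three overlapping pairs: each pair of the cross product then comes from two packed multiplies and one packed subtract, with no shuffling of lanes.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86 / x86-64 only)

// Hypothetical illustration type, NOT the real DVector3 class.
// Each __m128d holds two doubles: the low lane is the first letter
// of the member name, the high lane is the second.
struct PairVec3 {
    __m128d xy, yz, zx;

    // _mm_set_pd(high, low): the second argument lands in the low lane.
    PairVec3(double x, double y, double z)
        : xy(_mm_set_pd(y, x)), yz(_mm_set_pd(z, y)), zx(_mm_set_pd(x, z)) {}

    PairVec3(__m128d xy_, __m128d yz_, __m128d zx_)
        : xy(xy_), yz(yz_), zx(zx_) {}

    // _mm_cvtsd_f64 extracts the low lane of a register.
    double x() const { return _mm_cvtsd_f64(xy); }
    double y() const { return _mm_cvtsd_f64(yz); }
    double z() const { return _mm_cvtsd_f64(zx); }
};

// Component-wise sum: one packed add per stored pair.
inline PairVec3 operator+(const PairVec3& a, const PairVec3& b) {
    return PairVec3(_mm_add_pd(a.xy, b.xy),
                    _mm_add_pd(a.yz, b.yz),
                    _mm_add_pd(a.zx, b.zx));
}

// Cross product c = a x b, computed pair by pair:
//   c.xy = a.yz * b.zx - a.zx * b.yz   -> (c_x, c_y)
//   c.yz = a.zx * b.xy - a.xy * b.zx   -> (c_y, c_z)
//   c.zx = a.xy * b.yz - a.yz * b.xy   -> (c_z, c_x)
inline PairVec3 cross(const PairVec3& a, const PairVec3& b) {
    __m128d xy = _mm_sub_pd(_mm_mul_pd(a.yz, b.zx), _mm_mul_pd(a.zx, b.yz));
    __m128d yz = _mm_sub_pd(_mm_mul_pd(a.zx, b.xy), _mm_mul_pd(a.xy, b.zx));
    __m128d zx = _mm_sub_pd(_mm_mul_pd(a.xy, b.yz), _mm_mul_pd(a.yz, b.xy));
    return PairVec3(xy, yz, zx);
}
```

With gcc, a sketch like this would be built with the flags quoted above (-msse2 -mfpmath=sse -O2). The redundant storage means every update touches six doubles rather than three, so it trades extra arithmetic and memory for shuffle-free packed operations; whether that trade pays off depends on the operation.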