
Sat Sep 12 09:36:51 EDT 2015
numactl --interleave=all ../testing/testing_sgetrf -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000 --lapack
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 09:36:59 2015
% Usage: ../testing/testing_sgetrf [options] [-h|--help]

% ngpu 1
%   M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |PA-LU|/(N*|A|)
%========================================================================
  123   123      0.03 (   0.04)      5.38 (   0.00)     ---   
 1234  1234    137.55 (   0.01)     37.25 (   0.03)     ---   
   10    10      0.02 (   0.00)      0.08 (   0.00)     ---   
   20    20      0.18 (   0.00)      0.26 (   0.00)     ---   
   30    30      0.46 (   0.00)      0.65 (   0.00)     ---   
   40    40      0.87 (   0.00)      1.11 (   0.00)     ---   
   50    50      1.21 (   0.00)      1.50 (   0.00)     ---   
   60    60      1.78 (   0.00)      2.15 (   0.00)     ---   
   70    70      1.88 (   0.00)      2.64 (   0.00)     ---   
   80    80      2.99 (   0.00)      3.76 (   0.00)     ---   
   90    90      3.57 (   0.00)      3.92 (   0.00)     ---   
  100   100      4.27 (   0.00)      4.84 (   0.00)     ---   
  200   200     15.86 (   0.00)     18.02 (   0.00)     ---   
  300   300     30.03 (   0.00)     10.41 (   0.00)     ---   
  400   400     48.00 (   0.00)     20.52 (   0.00)     ---   
  500   500     64.40 (   0.00)     31.15 (   0.00)     ---   
  600   600     81.72 (   0.00)     40.38 (   0.00)     ---   
  700   700     97.21 (   0.00)     53.75 (   0.00)     ---   
  800   800    104.67 (   0.00)     64.51 (   0.01)     ---   
  900   900    115.97 (   0.00)     78.78 (   0.01)     ---   
 1000  1000    122.10 (   0.01)     93.89 (   0.01)     ---   
 2000  2000    169.65 (   0.03)    238.47 (   0.02)     ---   
 3000  3000    202.61 (   0.09)    407.86 (   0.04)     ---   
 4000  4000    298.01 (   0.14)    586.66 (   0.07)     ---   
 5000  5000    321.67 (   0.26)    726.01 (   0.11)     ---   
 6000  6000    315.50 (   0.46)    883.65 (   0.16)     ---   
 7000  7000    408.89 (   0.56)   1026.42 (   0.22)     ---   
 8000  8000    397.10 (   0.86)   1162.66 (   0.29)     ---   
 9000  9000    351.10 (   1.38)   1268.01 (   0.38)     ---   
10000 10000    441.06 (   1.51)   1360.18 (   0.49)     ---   
12000 12000    344.52 (   3.34)   1502.10 (   0.77)     ---   
14000 14000    377.47 (   4.85)   1600.32 (   1.14)     ---   
16000 16000    361.38 (   7.56)   1684.20 (   1.62)     ---   
18000 18000    372.39 (  10.44)   1762.74 (   2.21)     ---   
20000 20000    363.56 (  14.67)   1916.62 (   2.78)     ---   
Sat Sep 12 09:38:54 EDT 2015

Sat Sep 12 09:38:54 EDT 2015
numactl --interleave=all ../testing/testing_sgetrf_gpu -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 09:39:00 2015
% Usage: ../testing/testing_sgetrf_gpu [options] [-h|--help]

%   M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |PA-LU|/(N*|A|)
%========================================================================
  123   123     ---   (  ---  )      0.86 (   0.00)     ---  
 1234  1234     ---   (  ---  )    111.42 (   0.01)     ---  
   10    10     ---   (  ---  )      0.01 (   0.00)     ---  
   20    20     ---   (  ---  )      0.05 (   0.00)     ---  
   30    30     ---   (  ---  )      0.24 (   0.00)     ---  
   40    40     ---   (  ---  )      0.44 (   0.00)     ---  
   50    50     ---   (  ---  )      0.85 (   0.00)     ---  
   60    60     ---   (  ---  )      1.36 (   0.00)     ---  
   70    70     ---   (  ---  )      1.29 (   0.00)     ---  
   80    80     ---   (  ---  )      2.30 (   0.00)     ---  
   90    90     ---   (  ---  )      2.47 (   0.00)     ---  
  100   100     ---   (  ---  )      2.88 (   0.00)     ---  
  200   200     ---   (  ---  )      9.40 (   0.00)     ---  
  300   300     ---   (  ---  )      8.55 (   0.00)     ---  
  400   400     ---   (  ---  )     15.82 (   0.00)     ---  
  500   500     ---   (  ---  )     25.36 (   0.00)     ---  
  600   600     ---   (  ---  )     35.06 (   0.00)     ---  
  700   700     ---   (  ---  )     48.80 (   0.00)     ---  
  800   800     ---   (  ---  )     59.99 (   0.01)     ---  
  900   900     ---   (  ---  )     75.32 (   0.01)     ---  
 1000  1000     ---   (  ---  )     89.67 (   0.01)     ---  
 2000  2000     ---   (  ---  )    258.38 (   0.02)     ---  
 3000  3000     ---   (  ---  )    451.44 (   0.04)     ---  
 4000  4000     ---   (  ---  )    652.98 (   0.07)     ---  
 5000  5000     ---   (  ---  )    759.29 (   0.11)     ---  
 6000  6000     ---   (  ---  )    944.32 (   0.15)     ---  
 7000  7000     ---   (  ---  )    992.41 (   0.23)     ---  
 8000  8000     ---   (  ---  )   1290.98 (   0.26)     ---  
 9000  9000     ---   (  ---  )   1445.80 (   0.34)     ---  
10000 10000     ---   (  ---  )   1458.11 (   0.46)     ---  
12000 12000     ---   (  ---  )   1707.08 (   0.67)     ---  
14000 14000     ---   (  ---  )   1832.60 (   1.00)     ---  
16000 16000     ---   (  ---  )   1910.36 (   1.43)     ---  
18000 18000     ---   (  ---  )   1977.86 (   1.97)     ---  
20000 20000     ---   (  ---  )   2154.23 (   2.48)     ---  
Sat Sep 12 09:39:40 EDT 2015
