-
Notifications
You must be signed in to change notification settings - Fork 2
/
results.txt
203 lines (179 loc) · 11.2 KB
/
results.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
A log of informal measurements.
[2011.01.31] {First results from a Westmere machine}
How many random numbers can we generate in a second on one thread?
Cost of rdtsc (ffi call): 84
Approx getCPUTime calls per second: 209,798
Approx clock frequency: 3,331,093,772
First, timing with System.Random interface:
76,811,104 random ints generated [constant zero gen] ~ 43.37 cycles/int
14,482,725 random ints generated [System.Random stdGen] ~ 230 cycles/int
16,061 random ints generated [PureHaskell/reference] ~ 207,403 cycles/int
32,309 random ints generated [PureHaskell] ~ 103,101 cycles/int
2,401,893 random ints generated [Gladman inefficient] ~ 1,387 cycles/int
15,980,625 random ints generated [Gladman] ~ 208 cycles/int
2,329,500 random ints generated [IntelAES inefficient] ~ 1,430 cycles/int
32,383,799 random ints generated [IntelAES] ~ 103 cycles/int
Comparison to C's rand():
595,263,182 random ints generated [ptr store in C loop] ~ 5.60 cycles/int
71,392,408 random ints generated [rand/store in C loop] ~ 46.66 cycles/int
71,347,778 random ints generated [rand in Haskell loop] ~ 46.69 cycles/int
71,324,158 random ints generated [rand/store in Haskell loop] ~ 46.70 cycles/int
Finished.
---------------------------------------------------------
processor : 21
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU X5680 @ 3.33GHz
stepping : 2
cpu MHz : 1600.000
cache size : 12288 KB
physical id : 1
siblings : 12
core id : 8
cpu cores : 6
apicid : 49
initial apicid : 49
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat
tpr_shadow vnmi flexpriority ept vpid
bogomips : 6648.89
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
[2011.02.01] {Results from another westmere with multithreading}
Here's degradation under 4 threads:
How many random numbers can we generate in a second on one thread?
Cost of rdtsc (ffi call): 83
COUNTERS WRAPPED (4294963592,11072)
Approx getCPUTime calls per second: 165,900
WARNING: rdtsc not monotonically increasing, first 2292070581 then 717441175 on the same OS thread
Approx clock frequency: -1,574,629,406
First, timing with System.Random interface:
65,858,418 random ints generated [constant zero gen] ~ -23.91 cycles/int
12,488,802 random ints generated [System.Random stdGen] ~ -126.08 cycles/int
14,301 random ints generated [PureHaskell/reference] ~ -110106.24 cycles/int
29,499 random ints generated [PureHaskell] ~ -53379.08 cycles/int
2,040,034 random ints generated [Gladman inefficient] ~ -771.86 cycles/int
14,325,249 random ints generated [Gladman] ~ -109.92 cycles/int
2,096,602 random ints generated [IntelAES inefficient] ~ -751.04 cycles/int
26,548,561 random ints generated [IntelAES] ~ -59.31 cycles/int
Comparison to C's rand():
598,766,703 random ints generated [ptr store in C loop] ~ -2.63 cycles/int
71,964,505 random ints generated [rand/store in C loop] ~ -21.88 cycles/int
71,954,027 random ints generated [rand in Haskell loop] ~ -21.88 cycles/int
71,954,389 random ints generated [rand/store in Haskell loop] ~ -21.88 cycles/int
Now 4 threads, reporting mean randoms-per-second-per-thread:
First, timing with System.Random interface:
63,512,528 random ints generated [constant zero gen] ~ -24.79 cycles/int
12,128,406 random ints generated [System.Random stdGen] ~ -129.83 cycles/int
14,128 random ints generated [PureHaskell/reference] ~ -111452.54 cycles/int
28,352 random ints generated [PureHaskell] ~ -55539.54 cycles/int
1,451,664 random ints generated [Gladman inefficient] ~ -1084.71 cycles/int
13,397,974 random ints generated [Gladman] ~ -117.53 cycles/int
1,582,927 random ints generated [IntelAES inefficient] ~ -994.76 cycles/int
22,284,215 random ints generated [IntelAES] ~ -70.66 cycles/int
Comparison to C's rand():
321,710,129 random ints generated [ptr store in C loop] ~ -4.89 cycles/int
1,124,522 random ints generated [rand/store in C loop] ~ -1400.27 cycles/int
1,126,719 random ints generated [rand in Haskell loop] ~ -1397.54 cycles/int
1,473,265 random ints generated [rand/store in Haskell loop] ~ -1068.80 cycles/int
And 12 threads:
Now 12 threads, reporting mean randoms-per-second-per-thread:
First, timing with System.Random interface:
51,512,538 random ints generated [constant zero gen] ~ 20.67 cycles/int
10,422,615 random ints generated [System.Random stdGen] ~ 102 cycles/int
10,558 random ints generated [PureHaskell/reference] ~ 100,857 cycles/int
21,091 random ints generated [PureHaskell] ~ 50,489 cycles/int
1,081,750 random ints generated [Gladman inefficient] ~ 984 cycles/int
6,143,601 random ints generated [Gladman] ~ 173 cycles/int
1,037,113 random ints generated [IntelAES inefficient] ~ 1,027 cycles/int
6,807,703 random ints generated [IntelAES] ~ 156 cycles/int
Comparison to C's rand():
320,317,573 random ints generated [ptr store in C loop] ~ 3.32 cycles/int
670,323 random ints generated [rand/store in C loop] ~ 1,589 cycles/int
712,223 random ints generated [rand in Haskell loop] ~ 1,495 cycles/int
644,107 random ints generated [rand/store in Haskell loop] ~ 1,653 cycles/int
81M randoms total per second, worse than the 4 thread 88.8M per second.
And 24 threads (hyperthreading?)
Now 24 threads, reporting mean randoms-per-second-per-thread:
First, timing with System.Random interface:
14,270,704 random ints generated [constant zero gen] ~ 29.08 cycles/int
3,089,394 random ints generated [System.Random stdGen] ~ 134 cycles/int
3,938 random ints generated [PureHaskell/reference] ~ 105,406 cycles/int
7,916 random ints generated [PureHaskell] ~ 52,436 cycles/int
510,410 random ints generated [Gladman inefficient] ~ 813 cycles/int
2,418,117 random ints generated [Gladman] ~ 172 cycles/int
454,070 random ints generated [IntelAES inefficient] ~ 914 cycles/int
2,119,325 random ints generated [IntelAES] ~ 196 cycles/int
Comparison to C's rand():
195,244,368 random ints generated [ptr store in C loop] ~ 2.13 cycles/int
391,122 random ints generated [rand/store in C loop] ~ 1,061 cycles/int
442,710 random ints generated [rand in Haskell loop] ~ 938 cycles/int
393,143 random ints generated [rand/store in Haskell loop] ~ 1,056 cycles/int
[2012.04.06] {Results on different machines, now at IU}
Westmere, 3.1 ghz:
Does machine supports AESNI?: True
How many random numbers can we generate in a second on one thread?
Cost of rdtsc (ffi call): 86
Approx getCPUTime calls per second: 896,888
Approx clock frequency: 3,094,658,558
First, timing with System.Random interface:
55,369,295 random ints generated [constant zero gen] ~ 55.89 cycles/int
14,390,258 random ints generated [System.Random stdGen] ~ 215 cycles/int
17,578 random ints generated [PureHaskell AES/reference] ~ 176,053 cycles/int
35,335 random ints generated [PureHaskell AES] ~ 87,581 cycles/int
2,653,915 random ints generated [Gladman unbuffered] ~ 1,166 cycles/int
14,909,583 random ints generated [Gladman] ~ 208 cycles/int
29,158,273 random ints generated [Compound gladman/intel] ~ 106 cycles/int
2,634,862 random ints generated [IntelAES unbuffered] ~ 1,175 cycles/int
29,375,877 random ints generated [IntelAES] ~ 105 cycles/int
Comparison to C's rand():
120,363,481 random ints generated [rand in Haskell loop] ~ 25.71 cycles/int
116,238,187 random ints generated [rand/store in Haskell loop] ~ 26.62 cycles/int
Intel(R) Xeon(R) CPU X7350 @ 2.93GHz (hulk)
Does machine supports AESNI?: False
How many random numbers can we generate in a second on one thread?
Cost of rdtsc (ffi call): 187
Approx getCPUTime calls per second: 687,196
Approx clock frequency: 2,935,214,436
First, timing with System.Random interface:
47,480,971 random ints generated [constant zero gen] ~ 61.82 cycles/int
8,947,332 random ints generated [System.Random stdGen] ~ 328 cycles/int
15,760 random ints generated [PureHaskell AES/reference] ~ 186,245 cycles/int
31,587 random ints generated [PureHaskell AES] ~ 92,925 cycles/int
1,970,607 random ints generated [Gladman unbuffered] ~ 1,489 cycles/int
12,235,777 random ints generated [Gladman] ~ 240 cycles/int
12,402,738 random ints generated [Compound gladman/intel] ~ 237 cycles/int
[Skipping AESNI-only tests, current machine does not support these instructions.]
Comparison to C's rand():
103,988,751 random ints generated [rand in Haskell loop] ~ 28.23 cycles/int
100,400,038 random ints generated [rand/store in Haskell loop] ~ 29.24 cycles/int
Finished.
Quad-Core AMD Opteron(tm) Processor 8356 (idun)
Does machine supports AESNI?: False
How many random numbers can we generate in a second on one thread?
Cost of rdtsc (ffi call): 265
Approx getCPUTime calls per second: 337,514
Approx clock frequency: 2,301,993,766
First, timing with System.Random interface:
28,543,173 random ints generated [constant zero gen] ~ 80.65 cycles/int
5,990,531 random ints generated [System.Random stdGen] ~ 384 cycles/int
8,747 random ints generated [PureHaskell/reference] ~ 263,175 cycles/int
17,895 random ints generated [PureHaskell] ~ 128,639 cycles/int
692,578 random ints generated [Gladman unbuffered] ~ 3,324 cycles/int
5,989,430 random ints generated [Gladman] ~ 384 cycles/int
6,069,723 random ints generated [Compound gladman/intel] ~ 379 cycles/int
[Skipping AESNI-only tests, current machine does not support these instructions.]
Comparison to C's rand():
78,674,348 random ints generated [rand in Haskell loop] ~ 29.26 cycles/int
77,362,365 random ints generated [rand/store in Haskell loop] ~ 29.76 cycles/int
Finished.