elpa
Commit c602e2bf
Authored Apr 14, 2016 by Andreas Marek
Single precision SSE BLOCK 4 real kernel
Parent: 47da2281
Changes: 2
src/elpa2_kernels/elpa2_kernels_real_sse_4hv_single_precision.c
This diff is collapsed.
src/mod_compute_hh_trafo_real.F90
...
@@ -850,7 +850,6 @@ module compute_hh_trafo_real
          do j = ncols, 2, -2
            w(:,1) = bcast_buffer(1:nbw,j+off)
            w(:,2) = bcast_buffer(1:nbw,j+off-1)
            print *, "calling sse block2"
#ifdef WITH_OPENMP
            call double_hh_trafo_real_sse_2hv_single(a(1,j+off+a_off-1,istripe,my_thread), &
                                                     w, nbw, nl, stripe_width, nbw)
...
@@ -953,7 +952,6 @@ module compute_hh_trafo_real
#endif /* WITH_NO_SPECIFIC_REAL_KERNEL */
          ! X86 INTRINSIC CODE, USING 4 HOUSEHOLDER VECTORS
          do j = ncols, 4, -4
            print *, "calling 4"
            w(:,1) = bcast_buffer(1:nbw,j+off)
            w(:,2) = bcast_buffer(1:nbw,j+off-1)
            w(:,3) = bcast_buffer(1:nbw,j+off-2)
...