elpa
Commit 5d2e456f
Authored May 03, 2016 by Andreas Marek
Update file headers for single precision kernels
Parent: 665a95d2
Changes: 11
src/elpa2_kernels/elpa2_kernels_asm_x86_64_single_precision.s
-# --------------------------------------------------------------------------------------------------
+# This file is part of ELPA.
 #
-# This file contains the compute intensive kernels for the Householder transformations,
-# coded in x86_64 assembler and using SSE2/SSE3 instructions.
+# The ELPA library was originally created by the ELPA consortium,
+# consisting of the following organizations:
 #
-# It must be assembled with GNU assembler (just "as" on most Linux machines)
+# - Max Planck Computing and Data Facility (MPCDF), formerly known as
+#   Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
+# - Bergische Universität Wuppertal, Lehrstuhl für angewandte
+#   Informatik,
+# - Technische Universität München, Lehrstuhl für Informatik mit
+#   Schwerpunkt Wissenschaftliches Rechnen,
+# - Fritz-Haber-Institut, Berlin, Abt. Theorie,
+# - Max-Plack-Institut für Mathematik in den Naturwissenschaftrn,
+#   Leipzig, Abt. Komplexe Strukutren in Biologie und Kognition,
+#   and
+# - IBM Deutschland GmbH
 #
-# Copyright of the original code rests with the authors inside the ELPA
-# consortium. The copyright of any additional modifications shall rest
-# with their original authors, but shall adhere to the licensing terms
-# distributed along with the original code in the file "COPYING".
 #
-# --------------------------------------------------------------------------------------------------
+# More information can be found here:
+# http://elpa.mpcdf.mpg.de/
+#
+# ELPA is free software: you can redistribute it and/or modify
+# it under the terms of the version 3 of the license of the
+# GNU Lesser General Public License as published by the Free
+# Software Foundation.
+#
+# ELPA is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public License
+# along with ELPA. If not, see <http://www.gnu.org/licenses/>
+#
+# ELPA reflects a substantial effort on the part of the original
+# ELPA consortium, and we ask you to respect the spirit of the
+# license that we chose: i.e., please contribute any changes you
+# may have back to the original ELPA library distribution, and keep
+# any derivatives of ELPA under the same license that we chose for
+# the original distribution, the GNU Lesser General Public License.
+#
+# Author: Andreas Marek, MPCDF
 .globl double_hh_trafo_single_
 .globl single_hh_trafo_complex_single_
...
src/elpa2_kernels/elpa2_kernels_complex_avx-avx2_1hv_single_precision.c
@@ -42,23 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <complex.h>
...
src/elpa2_kernels/elpa2_kernels_complex_avx-avx2_2hv_single_precision.c
@@ -42,23 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <complex.h>
...
src/elpa2_kernels/elpa2_kernels_complex_sse_1hv_single_precision.c
@@ -42,24 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <complex.h>
...
src/elpa2_kernels/elpa2_kernels_complex_sse_2hv_single_precision.c
@@ -42,23 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <complex.h>
...
src/elpa2_kernels/elpa2_kernels_real_avx-avx2_2hv_single_precision.c
@@ -42,24 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <x86intrin.h>
...
src/elpa2_kernels/elpa2_kernels_real_avx-avx2_4hv_single_precision.c
@@ -42,23 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <x86intrin.h>
...
src/elpa2_kernels/elpa2_kernels_real_avx-avx2_6hv_single_precision.c
@@ -42,24 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <x86intrin.h>
...
src/elpa2_kernels/elpa2_kernels_real_sse_2hv_single_precision.c
@@ -42,24 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <x86intrin.h>
...
src/elpa2_kernels/elpa2_kernels_real_sse_4hv_single_precision.c
@@ -42,24 +42,8 @@
 // any derivatives of ELPA under the same license that we chose for
 // the original distribution, the GNU Lesser General Public License.
 //
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
 //
-// --------------------------------------------------------------------------------------------------
-//
-// This file contains the compute intensive kernels for the Householder transformations.
-// It should be compiled with the highest possible optimization level.
-//
-// On Intel Nehalem or Intel Westmere or AMD Magny Cours use -O3 -msse3
-// On Intel Sandy Bridge use -O3 -mavx
-//
-// Copyright of the original code rests with the authors inside the ELPA
-// consortium. The copyright of any additional modifications shall rest
-// with their original authors, but shall adhere to the licensing terms
-// distributed along with the original code in the file "COPYING".
-//
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
 #include "config-f90.h"
 #include <x86intrin.h>
...
src/elpa2_kernels/elpa2_kernels_real_sse_6hv_single_precision.c
@@ -56,10 +56,8 @@
 // with their original authors, but shall adhere to the licensing terms
 // distributed along with the original code in the file "COPYING".
 //
-// Author: Alexander Heinecke (alexander.heinecke@mytum.de)
-// Adapted for building a shared-library by Andreas Marek, MPCDF (andreas.marek@mpcdf.mpg.de)
-// --------------------------------------------------------------------------------------------------
+// Author: Andreas Marek, MPCDF, based on the double precision case of A. Heinecke
+//
 #include "config-f90.h"
 #include <x86intrin.h>
...