gah4 (01-27-19, 07:15 AM)
On Saturday, January 26, 2019 at 10:03:10 AM UTC-8, Spiros Bousbouras wrote:
> On Sat, 26 Jan 2019 08:17:32 -0800 (PST) spectrum <septc> wrote:


(snip)
> > Again, I imagined that only padding is allowed in a C struct
> > (but I'm not sure at all).
> > But, as often the case, the C standard might say nothing
> > about such "concrete" thing :)


> It does. Paragraph 15 of "6.7.2.1 Structure and union specifiers" says


> Within a structure object, the non-bit-field members and the units in
> which bit-fields reside have addresses that increase in the order in
> which they are declared. A pointer to a structure object, suitably
> converted, points to its initial member (or if that member is a
> bit-field, then to the unit in which it resides), and vice versa.
> There may be unnamed padding within a structure object, but not at
> its beginning.


C requires that a pointer to the struct, suitably converted, equal a pointer to the first member.

As noted, it also requires that the members be laid out in declaration order.

Some programs depend on the offsets in one struct being equal to those in
another struct that has the same members at the beginning, followed by more.

(Note, for example, that Fortran at least back to Fortran 66
requires this for blank common.)

If one wanted to be absolutely sure, one could build up a struct
with successive nested structs, but most don't do that.
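
For illustration, the nested-struct approach might look like the minimal C sketch below. The type and function names (header, message, message_kind) are hypothetical and do not come from the thread; the only rules relied on are the ones in the quoted paragraph (no padding before the first member, and a struct pointer converts to a pointer to its initial member).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct header {
    int kind;
    int length;
};

/* The shorter layout is embedded as the first member, so the shared
   prefix is guaranteed by construction rather than by hoping that two
   separately declared member lists get the same offsets. */
struct message {
    struct header head;
    double payload;
};

/* A pointer to a struct, suitably converted, points to its initial member. */
static int message_kind(const struct message *m)
{
    const struct header *h = (const struct header *)m;
    return h->kind;
}

/* Padding may not appear before the first member, so both addresses agree. */
static void check_prefix(const struct message *m)
{
    assert((const void *)m == (const void *)&m->head);
    assert(offsetof(struct message, head) == 0);
}
```

Code that only knows about struct header can then be handed a struct message pointer and read kind and length without knowing what follows.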

Note also that C doesn't have a way to take a compiled declaration
from one routine to another. Structs with the same structure,
but separately declared, have to interoperate.

As well as I know, Fortran does not require that.
gah4 (01-27-19, 07:22 AM)
On Saturday, January 26, 2019 at 7:53:24 AM UTC-8, Steve Lionel wrote:

(snip)

> A correction - compilers are allowed to insert padding for SEQUENCE
> types. The compilers I am familiar with don't do so by default.


As far as I know, it was believed, rightly or not, that
Fortran 66 allowed no padding in COMMON. Some arrangements
of variables then resulted in misaligned data.

Some systems would trap the alignment exception, copy the
data, perform the original operation, and copy the results (if needed)
back again. This is usually slow.
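
The same copy-first idea can be written out by hand. The C sketch below is only an illustration under stated assumptions: a hypothetical unpadded record holding a 4-byte integer followed directly by an 8-byte double, so the double starts at offset 4. The names pack_record and load_double are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unpadded record: a 4-byte integer followed immediately by
   an 8-byte double, so the double may be misaligned in memory. */
enum { INT_OFFSET = 0, DBL_OFFSET = 4, RECORD_SIZE = 12 };

/* Store both fields into a raw byte buffer with no padding between them. */
static void pack_record(unsigned char *buf, int32_t i, double d)
{
    assert(sizeof d == 8); /* assumption of this sketch */
    memcpy(buf + INT_OFFSET, &i, sizeof i);
    memcpy(buf + DBL_OFFSET, &d, sizeof d);
}

/* Copy the bytes into an aligned temporary instead of dereferencing a
   possibly misaligned double pointer. */
static double load_double(const unsigned char *buf)
{
    double tmp;
    memcpy(&tmp, buf + DBL_OFFSET, sizeof tmp);
    return tmp;
}
```

A caller would declare unsigned char buf[RECORD_SIZE], fill it with pack_record, and read the value back with load_double. The extra copy is the software counterpart of what a trap handler does, without the cost of taking the trap.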
michaelmetcalf (01-27-19, 11:22 AM)
On Saturday, 26 January 2019 01:28:17 UTC+1, spectrum wrote:
> PS. Here, I mean the section in Modern Fortran Explained (with the green cover).
> > Up to now, I have read the section B.2.1 ("Storage units"), which explains...


Having followed this discussion I can only point out that the section cited is in the chapter on what we call deprecated features, and that the last line reads "All use of storage association is error prone and we do not recommend it."!

Regards,

Mike Metcalf
JCampbell (01-27-19, 11:25 AM)
I modified the example above to include double precision z.
This modified example creates the possibility of padding between "x" and "z" for double precision alignment.
There is also the possibility of padding between records, i.e. between "y" and the next "x", for record alignment.

program main
  implicit none
  integer a( 1000 ), k
  a = [( k, k=1,size(a) )]

  ! Deliberate mismatch: an integer array is passed where test expects IntPair records
  call test( a, size( a ) / 6 )
end

subroutine test( pairs, np ) !! implicit interface
  implicit none
  type IntPair
    sequence
    integer x
    double precision z
    integer y
  end type
  integer :: np, ip, ii(2)
  type(IntPair) :: pairs( np )

  ! View the storage of z as two default integers
  ii = transfer ( pairs( 1 )%z, ii)
  print *, "first pair:", 1, pairs( 1 )%x, ii(1), pairs( 1 )%y
  do ip = 2,3
    ii = transfer ( pairs( ip )%z, ii)
    print *, "next pair:", ip, pairs( ip )%x, ii(1), pairs( ip )%y
  end do
end

gfortran 7.3 runs and the output shows that this padding is used.
A Type mismatch warning is provided for the inconsistent argument list.
Removing "sequence" gives the same padding result.

FTN95 Ver 8.2 runs and indicates that no padding is provided.
A type mismatch warning is provided for the inconsistent argument list.
Removing "sequence" gives an error that sequence is required and the program does not run.
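
For comparison, the same integer / double / integer layout can be inspected on the C side. The sketch below is an illustration only: struct int_pair and the helper names are hypothetical, and the amounts of padding depend on the platform ABI.

```c
#include <assert.h>
#include <stddef.h>

/* C analogue of the IntPair type above: int, double, int. */
struct int_pair {
    int x;
    double z;
    int y;
};

/* Bytes of padding inserted between x and z. */
static size_t pad_before_z(void)
{
    return offsetof(struct int_pair, z) - (offsetof(struct int_pair, x) + sizeof(int));
}

/* Bytes of trailing padding after y, which separate consecutive array elements. */
static size_t pad_after_y(void)
{
    return sizeof(struct int_pair) - (offsetof(struct int_pair, y) + sizeof(int));
}

/* Members must appear at increasing addresses in declaration order. */
static void check_order(void)
{
    assert(offsetof(struct int_pair, x) < offsetof(struct int_pair, z));
    assert(offsetof(struct int_pair, z) < offsetof(struct int_pair, y));
}
```

On ABIs where double is 8-byte aligned, both helpers would typically return 4, giving a 24-byte record, which is the size of 6 default integers assumed by the call to test above.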
gah4 (01-27-19, 12:41 PM)
On Sunday, January 27, 2019 at 1:22:13 AM UTC-8, michael...@compuserve.com wrote:

(snip)
> Having followed this discussion I can only point out that the section cited is in the chapter on what we call deprecated features, and that the last line reads "All use of storage association is error prone and we do not recommend it."!


How about storage association between a BIND(C) structure, and a C struct?
spectrum (01-27-19, 07:12 PM)
> the section cited is in the chapter on what we call deprecated features,

Thanks very much, this is exactly what I failed to notice yesterday...
(B. __Deprecated__ Features). I have been using the Kindle version with keyword search
and jumping around, so I only read the parts I was interested in. The meaning of
"deprecated" is also given at the top of that section (i.e., items not recommended
by the authors).

This section also seems to include not only old features, but some new features
like B.10.5 (one can omit "subroutine", "function", etc, after "end" but not
recommended by the authors).

# FWIW, The next section is "C. Obsolescent Features".

> FTN95 Ver 8.2 runs and indicates that no padding is provided.


It is interesting that different compilers behave differently (within the specification).
I will try with some other compilers in hand later.

> How about storage association between a BIND(C) structure, and a C struct?


(Though this was not addressed to me, here is just my opinion...)

While BIND(C) is very useful, I feel that having to specify the information
for it by hand (e.g., the definition of a C struct on the Fortran side) is pretty error-prone,
because the definition has to be updated manually whenever the C code
is modified. I guess this is inevitable at the moment for such a "foreign function interface",
but I hope that someday such FFIs will become safer and more automatic among
different languages (though it seems far from trivial...)

My one big question is how other people manage such FFI (between C/C++ and Fortran)
in a "safe" manner, but this is a different topic (so I should stop here).
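
One possible way to catch such drift at run time, sketched in C purely as an illustration: the C side exports its own view of the layout, and the caller compares it with what it believes. Everything here (struct particle and the particle_* functions) is hypothetical. The Fortran side, not shown, would be assumed to declare a matching bind(c) type and pass in values such as c_sizeof of that type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct shared with Fortran. The Fortran mirror is assumed
   to be a bind(c) type with c_int, c_double, c_int components in this order. */
struct particle {
    int id;
    double mass;
    int charge;
};

/* Layout as seen by the C compiler. */
size_t particle_size(void)          { return sizeof(struct particle); }
size_t particle_offset_mass(void)   { return offsetof(struct particle, mass); }
size_t particle_offset_charge(void) { return offsetof(struct particle, charge); }

/* Returns 1 if the caller's idea of the layout agrees with the C side. */
int particle_layout_matches(size_t size, size_t off_mass, size_t off_charge)
{
    return size == particle_size()
        && off_mass == particle_offset_mass()
        && off_charge == particle_offset_charge();
}

/* Sanity check for the C side alone: the struct must agree with itself. */
void particle_self_check(void)
{
    assert(particle_layout_matches(sizeof(struct particle),
                                   offsetof(struct particle, mass),
                                   offsetof(struct particle, charge)));
}
```

This does not remove the manual duplication, but a check like this at program start turns a silent layout mismatch into an immediate, visible failure after the C struct is edited.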
Ron Shepard (01-27-19, 07:38 PM)
On 1/26/19 11:15 PM, gah4 wrote:
[...]
> Note also that C doesn't have a way to take a compiled declaration
> from one routine to another. Structs with the same structure,
> but separately declared, have to interoperate.
> As well as I know, Fortran does not require that.


I think that was the original intent of SEQUENCE, even before the C
interoperability features were added. Usually in fortran, there is a
single definition of a derived type, say in a module, and then that
module is USEd everywhere, either directly or indirectly, that the type
is used. But SEQUENCE was provided in order to allow separate
definitions with components of identical types and kinds to be
interoperable.

This might be used in a closed source library, for example, that is
written with its definitions of internal derived types and distributed
in only compiled object form. Then the user of that library could
compile a module that contains its own definitions of those derived
types and uses those in the appropriate interface blocks. If both the
internal definitions and those in the interface module agree in type and
kind and have SEQUENCE, this is supposed to work. Without SEQUENCE, it
is not guaranteed to work.

Of course, if the user has a different fortran compiler than the library
author, then that means that padding within SEQUENCE derived types must
agree. But that also means that all of the other information (calling
conventions, optional arguments, assumed shape arrays, and so on) must
also agree. Without those things being defined in a standard somewhere,
I've always wondered how that could possibly work.

On the other hand, if such a standard did exist, then it seems like it
would be a small extra step to standardize the format of module files,
and then the interface module itself would no longer be required, the
entire library could be distributed in binary form with its own
precompiled interoperable module files.

Later in f2003 when C interoperability was added to the language, the
SEQUENCE idea was extended to allow derived types in fortran to be
interoperable with structures in the companion C compiler. It has been
pointed out here many times that these features also allow
interoperability between different fortran compilers, with no actual C
code involved at all, provided they share the same companion C compiler.

$.02 -Ron Shepard
spectrum (01-27-19, 08:42 PM)
After searching the net more, I have come across an Intel-Fortran page
for the SEQUENCE statement. The description is very concise and clear (and precisely
what I wanted to know), but... (please see below).



--- (begin: excerpt) ---

SEQUENCE: Preserves the storage order of a derived-type definition.

The SEQUENCE statement allows derived types to be used in common blocks
and to be equivalenced.

The SEQUENCE statement appears only as part of derived-type definitions.
It causes the components of the derived type to be stored in the same sequence
they are listed in the type definition. If you do not specify SEQUENCE, the physical
storage order is not necessarily the same as the order of components in the type
definition.

--- (end: excerpt) ---

Given the possibility of padding (in general), isn't it safer to mention that caveat
in the description...?

I've tried the following code, which is a slightly modified version of the sample code
in the above page:

program main
  implicit none

!DIR$ PACK:1
  type num1_seq
    sequence
    integer(2) :: int_val
    real(4) :: real_val
    logical(2) :: log_val
  end type
  type num2_seq
    sequence
    logical(2) :: log_val
    integer(2) :: int_val
    real(4) :: real_val
  end type
  type(num1_seq) num1
  type(num2_seq) num2
  character*8 s1, s2

  equivalence ( num1, s1 )
  equivalence ( num2, s2 )

  num1 % int_val = 2
  num1 % real_val = 3.5
  num1 % log_val = .TRUE.

  s2( 1:2 ) = s1( 7:8 )
  s2( 3:4 ) = s1( 1:2 )
  s2( 5:8 ) = s1( 3:6 )

  print *, num1 % int_val, num1 % real_val, num1 % log_val
  print *, num2 % int_val, num2 % real_val, num2 % log_val

  print *, sizeof( num1 )
  print *, sizeof( num2 )
end

which gives (with both gfortran and pgi, no option)

2 3.50000000 T
2 0.00000000 T
12
8

while "gfortran -fpack-derived" (an option to pack a derived type as closely as
possible) gives

2 3.50000000 T
2 3.50000000 T
8
8

I guess that the description in the Intel page assumes "PACK: 1", and that
ifort will give the same result as above (which I cannot try right now...)
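
As a cross-check of the sizes above, the same three components can be laid out in C with and without packing. This is only a sketch: the struct names are hypothetical, int16_t stands in for the 2-byte logical, and #pragma pack is a widely supported compiler extension (GCC, Clang, MSVC) rather than ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same component order as num1_seq: 2-byte integer, 4-byte real, 2-byte logical. */
struct num1_natural {
    int16_t int_val;
    float   real_val;
    int16_t log_val;
};

/* Reordered as in num2_seq so that the two 2-byte members are adjacent. */
struct num2_natural {
    int16_t log_val;
    int16_t int_val;
    float   real_val;
};

/* Packed variant of num1: members are stored back to back. */
#pragma pack(push, 1)
struct num1_packed {
    int16_t int_val;
    float   real_val;
    int16_t log_val;
};
#pragma pack(pop)

/* Number of padding bytes that packing removes from the num1 layout. */
static size_t padding_saved_by_packing(void)
{
    assert(sizeof(struct num1_packed) <= sizeof(struct num1_natural));
    return sizeof(struct num1_natural) - sizeof(struct num1_packed);
}
```

With a 4-byte aligned float, sizeof would typically give 12 for num1_natural and 8 for both num2_natural and num1_packed, in line with the 12 / 8 and 8 / 8 results printed above.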

gah4 (01-27-19, 09:32 PM)
On Sunday, January 27, 2019 at 10:42:32 AM UTC-8, spectrum wrote:

(snip)
> --- (begin: excerpt) ---

> SEQUENCE: Preserves the storage order of a derived-type definition.


> The SEQUENCE statement allows derived types to be used in common blocks
> and to be equivalenced.


This is the one I forgot.

As I noted above, it was believed (true or not) in Fortran 66 days,
and maybe also Fortran 77 days, that COMMON could not have padding.
Compilers would generate misaligned data, even on machines that didn't
have the ability to access such data without copying it first.

One reason for that requirement is EQUIVALENCE, for example between
REAL and DOUBLE PRECISION, where the latter is required to be twice
the size. One could reliably do such EQUIVALENCE and/or COMMON,
knowing how things would line up.

> The SEQUENCE statement appears only as part of derived-type definitions.
> It causes the components of the derived type to be stored in the
> same sequence they are listed in the type definition.
> If you do not specify SEQUENCE, the physical storage order is not
> necessarily the same as the order of components in the type definition.


Unless the padding differs between the two cases.
Ron Shepard (01-28-19, 12:29 AM)
On 1/27/19 11:12 AM, spectrum wrote:
> This section also seems to include not only old features, but some new features
> like B.10.5 (one can omit "subroutine", "function", etc, after "end" but not
> recommended by the authors).


I didn't check the context of this statement, but omitting the
subprogram type and name on an end statement is not a new thing, it is
the old pre-f90 thing. Adding the subprogram name, i.e. matching the end
statement to the appropriate subprogram, was introduced with f90. Unless
something has changed recently, adding the name is optional for external
subprograms (which I presume is the context of the above statement) and
required for module subprograms.

This is one of the editing changes that is required when you import
legacy code into modules. There is an emacs macro, bound to the TAB
character in f90-mode, that does this completion automatically. The
other change that is required for this is to remove the EXTERNAL
statements for the routines that used to be external but are now module
subprograms. That is not automatically done by emacs.

$.02 -Ron Shepard
Ian Harvey (01-28-19, 10:35 PM)
On 2019-01-28 07:59, Ron Shepard wrote:
[..]
> This is one of the editing changes that is required when you import
> legacy code into modules. There is an emacs macro, bound to the TAB
> character in f90-mode, that does this completion automatically.


The requirement for the subprogram "type" to appear in end statements
for module or internal subprograms was relaxed in F2008.

That is, what had to be written in F90-F2003 as:

MODULE m
CONTAINS
  SUBROUTINE s
  END SUBROUTINE
END MODULE

can be written from F2008 on as:

MODULE m
CONTAINS
  SUBROUTINE s
  END
END MODULE

making that aspect of the emacs macro redundant, bar coding style
requirements.

(The name on the corresponding end statement for a module or internal
subprogram (or anything END... really) has always been optional from F90 on.)
