Extended Attributes - what are they and how can you use them ?


One of the major features of OS/2 (first introduced in version 1.2) is
the installable file system.  This provides a standard way to support
different file systems under OS/2;  the most obvious example of such
a file system is the "High Performance File System" (HPFS) which is
supplied with OS/2.
The most obvious advantage of HPFS over the traditional File Allocation
Table (FAT) file system (as used by DOS) is that long file names are
allowed, thus breaking the restrictive "8.3" format of DOS (and OS/2
1.1) with which we are all familiar.

However another feature contained in the installable file system interface
is that of the attributes of a file.  The original FAT file system design
did allow a few, predefined, binary attributes such as 'read only' or
'system', but with OS/2 1.20 this idea was extended to a more general
set of file attributes, which are therefore given the name of "Extended
Attributes" (EAs).

Extended attributes are a property of directories as well as of files.

Not content with only supporting these attributes for the new installable
file systems OS/2 1.20 enhanced the original FAT structure to allow extended
attributes for disks using the FAT format as well as the newer HPFS format.

This was achieved by making use of previously reserved fields in the directory
entry for each file, and putting the EA data itself into a hidden file in
the root directory - named "EA DATA. SF".  This is a nice feature except
that, since it is an enhancement to the traditional FAT structure, if the same
disk is accessed under DOS it is terribly easy to destroy or corrupt the EAs.
(This is particularly true of DOS-based backup programs which typically
cannot cope at all either with the reserved fields in the directory or with
the peculiarly named data file, and usually fail to back up either of these!)

The basic intention for extended attributes is to provide a mechanism to
attach named data items (of variable structure) to a file.  In order to
allow maximum flexibility to the use of EAs certain item names
were reserved for standard purposes - an example is the ".TYPE" extended
attribute which defines the file type, such as "Plain Text".
IBM recommended that other programs using EAs should include some unique
designator, such as the company name and product, in the item name to
avoid conflicts.

    Use of EAs by OS/2

Extended attributes were not heavily used under OS/2 1.x - the system editor
kept asking annoying questions when saving text files and you had to decide
whether your file was 'Plain Text', 'OS/2 Command File' or 'DOS Command File',
but for most users most of the time it didn't matter much.  If extended
attributes were sometimes lost by, for example, using non-EA aware programs
then it was usually not even noticed.

However extended attributes are rather more heavily used under OS/2 2.0 -
especially for the desktop. If you look at the root directory of your boot
disk you will see a directory such as "OS!2_21.0_D" (if it is a FAT disk) or
"OS/2 2.0 Desktop" (if it is an HPFS disk), under which are subdirectories
with names like "TEMPLATE" or "TOOLKIT" corresponding to folders on the
desktop.  When you look at these directories you may be puzzled by the lack
of files - for example on my machine I have 24 subdirectories of C:\OS!2_21.0_D
but a total of only 11 files in them!

The reason for this is that the desktop information is held in the extended
attributes for the directories themselves, and so most of the folders and their
contents can be described without requiring any additional files.
So beware - if you attempt to back up your desktop configuration by using
a program like XCOPY you must ensure that you copy even empty subdirectories.
[So that's why the /e option is there in XCOPY !]

Provided you stick with programs written for OS/2 1.2 and above you are
likely to have few problems with extended attributes.  However since one
of the strengths of OS/2 2.0 is its ability to run DOS programs there are
likely to be problems to be overcome when accessing files with extended
attributes.

The simplest solution for single files is to use the OS/2 utility program
EAUtil, shipped with OS/2, which allows you to split the extended attributes
out from the data file (or directory) into a file and to recombine them later.

So for example if you wanted to send a file with EAs via a bulletin board
using one of the many non-EA aware archive programs you could do the following:

        C:>EAUTIL /s /p myfile myfile.ea  ; create copy of EAs in file
        C:>pkzip myfile myfile myfile.ea  ; ZIP both data file and EA file

and the recipient can then recombine the files to reconstruct the original
data file complete with EAs as follows:

        C:>pkunzip myfile                 ; extract myfile and myfile.ea
        C:>EAUTIL /j myfile myfile.ea     ; combine together into single file

The easiest way to see if a file on a FAT disk has EAs is to use the /n
option on DIR, which forces output to the 'new' HPFS output format.  The last
but one field is the size of the EAs.  For example:

C:>DIR /n c:\os!2_21.0_d

 Directory of C:\os!2_21.0_d

16-08-92  10:54p     <DIR>           0  .
16-08-92  10:54p     <DIR>           0  ..
16-08-92  10:54p     <DIR>         867  TEMPLATE
16-08-92  10:54p     <DIR>        3913  TOOLKIT

Note however that OS/2 does NOT provide a standard display tool for extended
attributes so it is not that easy to find out what the actual items are in the
extended attributes for a file.

    Overview of the APIs used for EAs

Extended attributes appear in the file system API from the very start -
when a file is created or replaced using DosOpen() extended attributes
can be specified.  (In much the same way a file can be created read only
or hidden.)

Once a file is created extended attributes can be queried and set using
DosQueryFileInfo/PathInfo and DosSetFileInfo/PathInfo.  The 'File' functions
are used to access a file using a file handle and the 'Path' functions are
used to access a file without opening it first, or a directory (since
directories cannot be opened using DosOpen the 'File' functions cannot be
used on them.)

These APIs require a (rather complicated) structure containing a list of
EA names and values, and are used to access explicit EAs with known names.

For more general requests for a specific item name across a set of files
the DosFindFirst and DosFindNext APIs (with info level of 2 or 3) can be
used to enumerate matching files and extract named EAs.  This is roughly
equivalent to a combination of the simple (viz info level 1) DosFindFirst/
DosFindNext (to get the file names), together with DosQueryPathInfo (to get
the EA information), but in a single call.  For this reason it will not be
discussed further in this article.

Finally for general information about extended attributes the DosEnumAttribute
call can be used to enumerate the entire set of extended attributes for a file.

The API functions themselves seem relatively sensible - they allow creation
of a file with extended attributes, querying and setting attributes for a
file or directory and enumerating the complete list of extended attributes.

There are a few problems - the main one being that there is no fail-safe way
of obtaining the size of a single EA (which you might like to do in order
to allocate a buffer of the correct size into which to read it!).

This is because, unlike most other OS/2 API calls, if the buffer provided
on a DosQueryXXX call is too small the buffer length is set to the size of
the ENTIRE EA SET FOR THE FILE rather than the (rather more useful) size
of the actual EAs you require!

The only API which will return the size of each EA is DosEnumAttribute, but in
order to guarantee consistent results (since theoretically other programs could
alter the file's attributes between calls to DosEnumAttribute) the programmer's
reference manual itself recommends first opening the file in deny-write mode.
Unfortunately (a) this is not always desirable and (b) this is no use at all
for directories - which cannot be opened!

There are two common ways of resolving this problem: method 1 is to use
DosEnumAttribute() and hope the results are consistent, method 2 is to
allocate a really big buffer so the EA being read is 'bound to fit'.
Neither way strikes me as desirable in a professional operating system!

However the real problems with using EAs come with the data types which have
been defined - both for accessing EAs and the format of the data itself.

    Overview of the data types used for EA access

In my opinion they are a mess.

In fact I think EA actually stands for 'extremely awkward' based on the
problems experienced when you try using them.  This article itself was
sparked off by discovering sample code for accessing EAs which could
create extended attributes which the same code was unable to read - if
it's that hard to write a sample program what hope do we have in using
EAs in real programs ?!

First the access data types, as used in DosQueryFileInfo for example.

It all starts with an EAOP2 structure, which basically contains nothing
but pointers to two further structures: a GEA2LIST and a FEA2LIST.

Both structures are used for query type of operations: the GEA2LIST
contains a list of the names of the EAs required, and the FEA2LIST
points to a buffer which is to contain the actual EA data.

Only the FEA2LIST is used for set type of operations: the GEA2LIST
is ignored.

The GEA2LIST and FEA2LIST both consist of a header (a total buffer length)
followed by an 'array' of variable sized data structures.  Each data
structure in turn contains an 'offset to next' field, the length of
the EA name and the name itself.  The FEA2 structure also contains a
flag byte and then (finally!) the actual EA data itself.

All clear so far ?  To make it a bit easier here is a schematic diagram
of an EAOP2 request buffer after requesting two EAs:

+---------------------+---------------------+---------------------+
| GEA2LIST pointer    | FEA2LIST pointer    | 0 (no error offset) |
+---------------------+---------------------+---------------------+
      |                     |
      |                     |
      V    first GEA2       |                  second GEA2
+--------+---------+----------+----------+---+---------+---------+------------+
| length | offset  |  EA name | EA name  |pad|   0 (no | EA name | EA name    |
| of list| to next |  length  | + NUL    |   |   next) | length  | + NUL      |
+--------+---------+----------+----------+---+---------+---------+------------+
                            |
      +---------------------+
      |
      V    first FEA2
+--------+---------+-------+----------+----------+----------+------------ - -
| length | offset  |  flag |  EA name | data item| EA name  | data item
| of list| to next |  byte |  length  |  length  |          | itself
+--------+---------+-------+----------+----------+----------+------------ - -

             second FEA2
  - - --+---+-------+------+---------+----------+-----------+-----------+
        |pad| 0 (no | flag | EA name | data item| EA name   | data item |
        |   | next) | byte | length  |  length  |           | itself    |
  - - --+---+-------+------+---------+----------+-----------+-----------+

(I hope the picture is worth a thousand words in showing the relationship of
the various structures and fields)

Since both the EA names AND the data item are of variable length, this sort
of structure is hard to manipulate using C - and that's without touching
the actual format of the EA data item.
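
To make the layout a little more concrete, here is a minimal sketch of the
size arithmetic for a single entry.  It deliberately avoids <os2.h> so that it
compiles anywhere.  The helper names (round_up4, gea2_entry_size,
fea2_entry_size, next_entry_offset) are hypothetical, and the field widths
(4 byte offset, 1 byte flag, 1 byte name length, 2 byte value length) are my
reading of the toolkit structures rather than anything taken from the headers
themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round a byte count up to the next multiple of four (a doubleword). */
static size_t round_up4(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Bytes used by one GEA2 entry: a 4 byte 'offset to next', a 1 byte
   name length, then the name and its NUL terminator. */
static size_t gea2_entry_size(const char *name)
{
    assert(name != NULL);
    return 4 + 1 + strlen(name) + 1;
}

/* Bytes used by one FEA2 entry: a 4 byte 'offset to next', a 1 byte flag,
   a 1 byte name length, a 2 byte value length, the name and its NUL,
   then cbValue bytes of EA data. */
static size_t fea2_entry_size(const char *name, size_t cbValue)
{
    assert(name != NULL);
    return 4 + 1 + 1 + 2 + strlen(name) + 1 + cbValue;
}

/* Distance from the start of one entry to the start of the next: the
   'pad' boxes in the diagram keep each entry on a doubleword boundary. */
static size_t next_entry_offset(size_t entry_size)
{
    return round_up4(entry_size);
}
```

For example, a GEA2 entry naming ".TYPE" needs 11 bytes, so a following
entry would start 12 bytes later.  The last entry in a list stores 0 in
its 'offset to next' field instead.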

    Overview of the data formats of EA data

OS/2 recommends but does not impose a standard format scheme for EA data.

Firstly the names of EAs starting with '.' are reserved for system EAs, of
which .TYPE (the file type) and .CLASSINFO (SOM class information) are
examples.

Secondly there are a number of standard formats each consisting of a two byte
EA type value followed by type specific data.

(1) Simple data types - which all begin with a 2 byte length then the data:
    EAT_BINARY (binary data), EAT_ASCII (ASCII text),
    EAT_BITMAP (bitmap), EAT_METAFILE (OS/2 metafile),
    EAT_ICON (icon)

    A special case of these is EAT_EA which contains the name of another EA
    containing further data.  This provides, among other things, a way of
    generating EA data of more than 64K, which is the limit for a single
    EA data item.

(2) Headers for more complicated data types:
    EAT_MVMT which defines a multi-valued, multi-typed field such as is used
    for the .COMMENT EA (there may be multiple comments of different data
    types for a single file),
    EAT_MVST which defines a multi-valued single-type field (as a simplification
    of the MVMT type when all items have the same type),
    EAT_ASN1 which defines an ASN.1 ISO standard multi-valued data stream (I
    have never seen an example of this one 'in the wild' but I expect someone
    uses it!)

    Just to make life REALLY interesting a multi-valued field can include
    multi-valued subfields as well as simple data types.

(3) In addition the values 0 to 0x7fff are reserved for user-defined types.
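
As an illustration of these formats, the following sketch encodes a string as
an EAT_ASCII value and walks the entries of an EAT_MVMT value.  It is portable
C that only mimics the byte layout and does not call OS/2.  The SKETCH_
constants and function names are hypothetical; the numeric values are the ones
I believe the toolkit headers give to EAT_ASCII, EAT_MVMT and EAT_ASN1, and
the MVMT layout assumed is type, code page, entry count, then a type/length/
data triple per entry.  Nested multi-valued entries are rejected rather than
parsed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_EAT_ASCII 0xFFFDu  /* assumed value of EAT_ASCII */
#define SKETCH_EAT_MVMT  0xFFDFu  /* assumed value of EAT_MVMT  */
#define SKETCH_EAT_ASN1  0xFFDDu  /* assumed value of EAT_ASN1  */

/* Store and fetch 16 bit values in little-endian order, as on disk. */
static void put_u16(unsigned char *p, unsigned v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
}

static unsigned get_u16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Encode text as an EAT_ASCII value: type word, length word, then the
   characters without a trailing NUL.  Returns the number of bytes
   written, or 0 if the text is too long or the buffer too small. */
static size_t encode_ascii_ea(const char *text, unsigned char *out,
                              size_t cbOut)
{
    size_t len;

    assert(text != NULL && out != NULL);
    len = strlen(text);
    if (len > 0xFFFFu - 4u || cbOut < len + 4u)
        return 0;
    put_u16(out, SKETCH_EAT_ASCII);
    put_u16(out + 2, (unsigned)len);
    memcpy(out + 4, text, len);
    return len + 4u;
}

/* Walk an EAT_MVMT value whose entries are all simple (length-preceded)
   types.  Returns the entry count, or -1 if the value is not MVMT, is
   truncated, or contains a nested multi-valued entry. */
static int count_mvmt_entries(const unsigned char *val, size_t cbVal)
{
    size_t pos = 6;               /* skip type, code page and count */
    unsigned count, i;

    assert(val != NULL);
    if (cbVal < 6 || get_u16(val) != SKETCH_EAT_MVMT)
        return -1;
    count = get_u16(val + 4);
    for (i = 0; i < count; i++) {
        unsigned type, len;

        if (cbVal - pos < 4)
            return -1;            /* no room for type and length words */
        type = get_u16(val + pos);
        if (type >= SKETCH_EAT_ASN1 && type <= SKETCH_EAT_MVMT)
            return -1;            /* nested multi-valued: not handled */
        len = get_u16(val + pos + 2);
        pos += 4;
        if (cbVal - pos < len)
            return -1;            /* data runs past the end of the value */
        pos += len;
    }
    return (int)count;
}
```

Encoding "Hi" this way produces the six bytes FD FF 02 00 48 69.  A real
display program would need a case for every type it wants to show, which is
where the trouble starts.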

This flexibility makes it impossible to write general EA display programs:
    (a) user defined EAs follow no rules at all
    (b) even the 'standard' EAs are interpreted differently by different
        programs
    (c) OS/2 does no checking of the format of EA data items when writing
        them to disk.

However despite this extended attributes can be useful, but please bear
the above problems in mind when coding - especially if you ever write code to
process multi-valued EAs!

    Description of the sample program

Given the problems described in the overviews above I thought that a
nice simple example program would perhaps encourage more OS/2
programmers to venture into the area.

The example I have used is restricted to the simple single-valued ASCII
data type, such as is used for the .LONGNAME or .VERSION standard EAs.

This data type can be used for your own files - for example to attach a quick
textual note to a file such as a README attribute describing the file,
or a note of when it was last backed up!

Since it is hard to manipulate the data structures used for EA access I
decided to write a couple of access functions:

        EAQueryString() - to read a NUL terminated string EA
        EASetString() - to write a NUL terminated string EA

Obviously this method could be extended to cover the standard data types,
and provide a more 'programmer-friendly' interface.

The program itself merely calls the appropriate function to read or
write the named EA.

Note that opinion among OS/2 programs appears divided over the question of
whether the ASCII data item includes the trailing NUL character - I prefer
removing it since the string length is defined by the 2 byte length following
the EAT_ASCII type value, but other programs leave the NUL in.

It is a good idea to process either format, whichever one your
programs will actually generate!
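
A minimal sketch of accepting either format is shown below.  It is plain C
with no OS/2 dependencies, and the function name ascii_ea_to_string is
hypothetical; it takes the data bytes and the length from the EAT_ASCII
header and hands back an ordinary NUL terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy 'len' bytes of EAT_ASCII data into 'out' as a NUL terminated
   string.  If the writer stored a trailing NUL as part of the data it is
   dropped first, so both conventions give the same result.
   Returns 0 on success, or -1 if 'out' is too small. */
static int ascii_ea_to_string(const char *data, size_t len,
                              char *out, size_t cbOut)
{
    assert(data != NULL && out != NULL);

    if (len > 0 && data[len - 1] == '\0')
        len--;                    /* writer included the NUL: ignore it */

    if (len >= cbOut)
        return -1;                /* no room for the data plus our NUL */

    memcpy(out, data, len);
    out[len] = '\0';
    return 0;
}
```

Checking that the length is non-zero before looking at the last byte avoids
reading in front of the data when an EA holds an empty string.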

EADemo expects two or three arguments.  The first argument is the file (or
directory) name and the second is the name of the EA item required.
If there is a third argument it is the value to set the EA item to; if there
is no third item the program merely displays the current value of the EA item.

Note that since OS/2 does not provide an explicit API to delete an extended
attribute EADemo takes a zero-length string to imply deletion.

The programs are compiled as follows.
(I am using IBM C Set/2)

        icc /c EAString.c
        icc EADemo.c EAString.obj

Then for example:

        C:>echo. > sample

        C:>EADemo sample read.me "A simple test of the program"
        Value of EA item read.me set to: "A simple test of the program"

        C:>EADemo sample read.me
        Value of EA item read.me is: "A simple test of the program"

        C:>EADemo sample read.me ""
        EA item read.me deleted

    Comments on the program

        It is a little longer than I usually hope for in articles of
        this type - partly reflecting the difficulties referred to above
        in the way the API has been implemented.  I have liberally
        commented the code, rather than writing a large amount of separate
        description, in the hope of providing a more useful working example.

        EAString.c is a general purpose piece of code to read and write ASCII
        EAs.  It does not make efficient use of memory since every request
        malloc's and free's a buffer.  In addition it relies on being told
        how big a string to read data into, but it suffices for simple use.

        The EAOP2 structure is only used inside the EAQueryData and
        EASetData functions.  I do not find it a useful structure
        when programming as it adds so little information to the underlying
        GEA2LIST and FEA2LIST structures.

        Note that OS/2 2.0 will round the size of the buffer up to the
        NEXT DOUBLEWORD BOUNDARY so make sure that you pick a buffer length
        divisible by four (or allocate 4 bytes more than the length you
        said) - see the comment in EAQueryString().

        The EADemo program attempts to display a text message on any error
        by loading the appropriate error message from OSO001.MSG.  Note
        however that error 111 (ERROR_BUFFER_OVERFLOW), which is generated
        by OS/2 and EAString when the buffer is too small to hold the
        EA data requested, is interpreted in the message file as
        "SYS0111: the file name is too long" rather than a more relevant
        message referring to buffer sizes!
        You may prefer to 'lie' and map error 111 to another error code
        such as error 122 (ERROR_INSUFFICIENT_BUFFER) which has a more
        meaningful text associated with it.
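
One way to apply that 'lie' is a small mapping function called just before
the message lookup.  This is only a sketch: the function name
map_ea_error and the SKETCH_ constants are hypothetical stand-ins for the
values quoted above, so that the fragment compiles without <os2.h>.

```c
#define SKETCH_ERROR_BUFFER_OVERFLOW      111ul  /* as quoted above */
#define SKETCH_ERROR_INSUFFICIENT_BUFFER  122ul  /* as quoted above */

/* Replace error 111 with error 122 so that the text loaded from the
   message file talks about buffers; all other codes pass through. */
static unsigned long map_ea_error(unsigned long rc)
{
    if (rc == SKETCH_ERROR_BUFFER_OVERFLOW)
        return SKETCH_ERROR_INSUFFICIENT_BUFFER;
    return rc;
}
```

In EADemo this would amount to calling printrc( map_ea_error( rc ) ) rather
than printrc( rc ), while still returning the original code from main().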


Extended attributes are a nice idea but I believe they are spoilt by the
poor interface.  Hopefully over time IBM themselves will address this and
provide a more usable interface - in the meantime writing simple functions
to perform one task (as this article demonstrates) can make it considerably
easier to add basic EA functionality to your programs by hiding the complexity
of the interface inside various access functions.

------------------------- EAString.h --------------------------

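/* Header file for EAString.c module */

/* Module error codes: the values are ASSUMED (user-defined error range) */
#define EA_ERROR_NOT_FOUND  0xFF01
#define EA_ERROR_WRONG_TYPE 0xFF02

APIRET EAQueryString( PSZ pszPathName, PSZ pszEAName, USHORT cbBuf, PSZ pszBuf );
APIRET EASetString( PSZ pszPathName, PSZ pszEAName, PSZ pszBuf );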


------------------------- EAString.c --------------------------

/* EAString.c - functions to read and write single-valued ASCII EA */

#define INCL_DOS
#include <os2.h>

#include <stdlib.h>
#include <string.h>

#include "EAString.h"

#pragma pack(1)

/* Header for a single-valued ASCII EA data item */
typedef struct _EA_ASCII_header
{
   USHORT usAttr;                 /* value: EAT_ASCII                        */
   USHORT usLen;                  /* length of data                          */
                                  /* ASCII data fits in here ...             */
} EA_ASCII_HEADER, *PEA_ASCII_HEADER;

#pragma pack()

/* EAQueryData: query EA data using supplied 'get' EA list into supplied     */
/*              'full' EA buffer - which need NOT be initialised first       */

static APIRET EAQueryData( PSZ pszPathName, PGEA2LIST pGEA2List,
                           ULONG cbBuf, PFEA2LIST pFEA2List )
{
   EAOP2 eaop2 = { NULL, NULL, 0 }; /* EA 'root' data structure              */

   eaop2.fpGEA2List = pGEA2List;
   eaop2.fpFEA2List = pFEA2List;
   pFEA2List->cbList = cbBuf;     /* Inform OS/2 how big our FEA2List is     */

   return DosQueryPathInfo( pszPathName, FIL_QUERYEASFROMLIST,
             (PBYTE) &eaop2, sizeof ( eaop2 ) );
}

/* EASetData: set EA data using supplied 'full' EA buffer                    */

static APIRET EASetData( PSZ pszPathName, PFEA2LIST pFEA2List )
{
   EAOP2 eaop2 = { NULL, NULL, 0 }; /* EA 'root' data structure              */

   eaop2.fpFEA2List = pFEA2List;

   /* Despite its name, FIL_QUERYEASIZE is the info level used to SET EAs */
   return DosSetPathInfo( pszPathName, FIL_QUERYEASIZE,
             (PBYTE) &eaop2, sizeof ( eaop2 ), DSPI_WRTTHRU );
}

/* EAQueryString: query EA ASCII data into a supplied buffer as a NUL        */
/*                terminated string.                                         */
/*                                                                           */
/* Note: the NUL terminator is NOT required in the data itself - it will be  */
/* added if required.  (Some ASCII EAs include a NUL, some don't !)          */

APIRET EAQueryString( PSZ pszPathName, PSZ pszEAName, USHORT cbBuf, PSZ pszBuf )
{
   APIRET rc = ERROR_NOT_ENOUGH_MEMORY; /* return code                       */
   PFEA2LIST pFEA2List = NULL;    /* pointer to returned EA data             */
   PGEA2LIST pGEA2List = NULL;    /* pointer to list of EAs requested        */
   PEA_ASCII_HEADER pEAData = NULL; /* pointer to data item itself           */
   size_t GEAlen = 0;             /* length of GEA list                      */
   size_t FEAlen = 0;             /* length of FEA list                      */
   PSZ pszAscii = NULL;           /* pointer to ASCII data itself            */

   /*
    * Build an FEA2 list buffer with enough space for cbBuf bytes of data
    * The length is obtained by:
    *     size for FEA2LIST header and one FEA2 item
    *   + room for the EA name (the NUL is included in size of FEA2! )
    *   + EAT_ASCII header
    *   + up to cbBuf bytes of EAT_ASCII data (may or may not end with a NUL)
    */
   FEAlen = sizeof( FEA2LIST ) + strlen( pszEAName ) +
               sizeof( EA_ASCII_HEADER ) + cbBuf;

   /* FEAlen MUST be rounded up to a doubleword boundary since
      OS/2 may use buffer space up to this boundary */
   FEAlen = ( ( FEAlen + 3 ) / 4 ) * 4;

   pFEA2List = (PFEA2LIST) malloc( FEAlen );
   if ( pFEA2List != NULL )
   {
      /*
       * Build a GEA2 list for the EA we require
       * The length is obtained by:
       *        size for GEA2LIST header and one GEA2 item
       *      + room for the EA name (the NUL is included in the size of GEA2 !)
       */
      GEAlen = sizeof( GEA2LIST ) + strlen( pszEAName );
      pGEA2List = (PGEA2LIST) malloc( GEAlen );
      if ( pGEA2List != NULL )
      {
         pGEA2List->cbList = GEAlen;
         pGEA2List->list[0].oNextEntryOffset = 0;
         pGEA2List->list[0].cbName = (BYTE)strlen( pszEAName );
         strcpy( pGEA2List->list[0].szName, pszEAName );

         rc = EAQueryData( pszPathName, pGEA2List, FEAlen, pFEA2List );
         if ( rc == 0 )
         {
            if ( pFEA2List->list[0].cbValue == 0 )
            {
               /* There is no data for this EA, return an error */
               rc = EA_ERROR_NOT_FOUND;
            }
            else
            {
               /* Verify the data type is what we're expecting */
               pEAData = (PEA_ASCII_HEADER) ( (PSZ)pFEA2List->list[0].szName
                            + pFEA2List->list[0].cbName + 1 );
               if ( pEAData->usAttr == EAT_ASCII )
               {
                  /* skip ASCII header to point to ASCII data */
                  pszAscii = (PSZ) (pEAData + 1);

                  /* If a trailing NUL is present, ignore it */
                  if ( pEAData->usLen > 0 &&
                       pszAscii[ pEAData->usLen - 1 ] == '\0' )
                     pEAData->usLen--;

                  if ( pEAData->usLen < cbBuf )
                  {
                     /* Give the user the data as a NUL terminated string */
                     memcpy( pszBuf, pEAData + 1, pEAData->usLen );
                     pszBuf[ pEAData->usLen ] = '\0';
                  }
                  else
                  {
                     /* data read is too long for user's buffer */
                     rc = ERROR_BUFFER_OVERFLOW;
                  }
               }
               else
               {
                  /* This function only processes EAT_ASCII ! */
                  rc = EA_ERROR_WRONG_TYPE;
               }
            }
         }

         free( pGEA2List );
      }

      free( pFEA2List );
   }

   return rc;
}

/* EASetString: set EA ASCII data from a NUL terminated string               */
/*                                                                           */
/* Note1: the NUL terminator is NOT stored since the EAT_ASCII type already  */
/* includes a length field.                                                  */
/* Note2: setting a string consisting only of the NUL character will delete  */
/* the EA.                                                                   */

APIRET EASetString( PSZ pszPathName, PSZ pszEAName, PSZ pszBuf )
{
   APIRET rc = ERROR_NOT_ENOUGH_MEMORY; /* return code                       */
   PFEA2LIST pFEA2List = NULL;    /* pointer to EA data to be written        */
   PFEA2 pFEA2 = NULL;
   PEA_ASCII_HEADER pEAData = NULL; /* pointer to data item itself           */
   size_t len = 0;
   size_t cbBuf = 0;

   /* Build an FEA2LIST buffer of the right size (see EAQueryString above) */
   len = sizeof( FEA2LIST ) + strlen( pszEAName );
   cbBuf = strlen( pszBuf );
   if ( cbBuf != 0 )
      len += sizeof( EA_ASCII_HEADER ) + cbBuf;

   pFEA2List = (PFEA2LIST) malloc( len );
   if ( pFEA2List != NULL )
   {
      pFEA2List->cbList = len;

      pFEA2 = pFEA2List->list;
      pFEA2->oNextEntryOffset = 0; /* no more fields                         */
      pFEA2->fEA = 0;             /* no flags                                */
      pFEA2->cbName = (BYTE) strlen( pszEAName );
      strcpy( pFEA2->szName, pszEAName );

      if ( cbBuf == 0 )
      {
         pFEA2->cbValue = 0;      /* this will delete the EA!                */
      }
      else
      {
         pFEA2->cbValue = (USHORT)( sizeof( EA_ASCII_HEADER ) + cbBuf );

         /* Fill in the EA data area using an ASCII EA template */
         pEAData = (PEA_ASCII_HEADER) ( (PSZ)pFEA2List->list[0].szName
                      + pFEA2List->list[0].cbName + 1 );
         pEAData->usAttr = EAT_ASCII;
         pEAData->usLen = (USHORT) cbBuf;
         memcpy( pEAData + 1, pszBuf, cbBuf );
      }

      rc = EASetData( pszPathName, pFEA2List );

      free( pFEA2List );
   }

   return rc;
}

------------------------- EADemo.c ----------------------------

/* EADemo.c - program to read or write a single-valued ASCII EA */

#define INCL_DOS
#include <os2.h>

#include <stdio.h>
#include <stdlib.h>

#include        "EAString.h"

/* printrc: print an explanatory message for an OS/2 (or user-defined)       */
/*          error code                                                       */

static void printrc( APIRET rc )
{
   CHAR pchBuf[512] = {'\0'};
   ULONG ulMsgLen = 0;
   APIRET ret = 0;                /* return code from DosGetMessage()        */

   ret = DosGetMessage( NULL, 0, pchBuf, sizeof( pchBuf ), rc,
                        "OSO001.MSG", &ulMsgLen );
   if ( ret == 0 )
      printf( "%.*s", (int) ulMsgLen, pchBuf );
   else if ( rc == EA_ERROR_NOT_FOUND )
      printf( "EA item was not found" );
   else if ( rc == EA_ERROR_WRONG_TYPE )
      printf( "EA data is not simple ASCII" );
   else
      printf( "OS/2 error code: %lu", (unsigned long) rc );
}

/* M A I N   P R O G R A M                                                   */

int main( int argc, char **argv )
{
   APIRET rc = 0;
   char szBuf[ 256 ] = {'\0'};    /* arbitrary max length for buffer         */

   if ( ( argc != 3 ) && ( argc != 4 ) )
   {
      printf( "Syntax: EADemo <file> <EAname> [text]\n" );
      exit ( 1 );
   }

   if ( argc == 3 )
      rc = EAQueryString( argv[1], argv[2], sizeof( szBuf ), szBuf );
   else
      rc = EASetString( argv[1], argv[2], argv[3] );

   if ( rc != 0 )
   {
      printf( "Unable to access EA item %s\n", argv[2] );
      printrc( rc );
      printf( "\n" );
   }
   else
   {
      if ( argc == 3 )
         printf( "Value of EA item %s is: \"%s\"\n", argv[2], szBuf );
      else if ( argv[3][0] == '\0' )
         printf( "EA item %s deleted\n", argv[2] );
      else
         printf( "Value of EA item %s set to: \"%s\"\n", argv[2], argv[3] );
   }

   return (int) rc;
}

                                                                   Roger Orr

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License