Public:MDF Format

From Mesonet Wiki

Revision as of 14:41, 19 June 2000; Brad

tn009: Uncompressed MDF/MTS specification

original: 2/1/99 author: Brad

1. Introduction

Uncompressed Mesonet data files (MDF) and Mesonet time series (MTS) files were the first type of files generated for mass distribution to customers. Their tabular layout is excellent for both printing and importing into spreadsheets and/or word processing documents. The specification for this particular kind of MDF file has undergone considerable modification since its inception, but generally holds true to its original format.


2. The Format

Uncompressed MDF files should only contain standard, printable ASCII characters. Line breaks can be in DOS (carriage return and line feed characters), Mac (carriage return character only), or Unix (line feed character only) format, but should be consistent within a file. Uncompressed files should be transmittable in FTP or HTTP text mode without character conversions, aside from the line breaks.
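As an illustration of the line-break rule above, the following is a minimal Python sketch of how a reader might split a file into lines regardless of which of the three conventions was used. The function name `split_mdf_lines` is hypothetical, and dropping blank lines is an assumption of this sketch, not part of the specification.

```python
def split_mdf_lines(raw):
    """Split raw file bytes into text lines, accepting DOS, Mac, or Unix breaks."""
    lines = []
    # bytes.splitlines() breaks on \r\n, \r, and \n
    for chunk in raw.splitlines():
        # decode() raises UnicodeDecodeError if a non-ASCII byte is present
        text = chunk.decode("ascii")
        if text.strip():  # assumption: blank lines carry no data and are skipped
            lines.append(text)
    return lines


print(split_mdf_lines(b"101 !Copyright\r\n  5 1994 07 07 00 00 00\r\n"))
```

Leading spaces are preserved so that the column alignment of each line is left untouched.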

An uncompressed MDF file consists of four key elements:

  1. A version number which identifies the general format as well as whether the file is compressed or uncompressed.
  2. Number of parameters and base time stamp.
  3. A list of parameter identifiers. The first three parameter identifiers (STID, STNM, and TIME) do not count in the number of parameters listed in the previous item.
  4. A set of data records for each station (MDF) or time (MTS).

----

2.1 The Version Number

This is a version number line from an uncompressed MDF file downloaded from the Mesonet BBS:

101 !Copyright (c) 1995 Oklahoma Climatological Survey.

In an uncompressed MDF file, the version number is always 101, since the uncompressed MDF format has not been modified since the use of version numbers began. MDF and MTS files are defined internally to have an initial version of 100. Another number is added to indicate the format in which the MDF or MTS file was created; in this case the number is 1, which indicates an uncompressed file. Compressed MDF/MTS files are labeled with an even number (currently 102). If the file formats were ever modified, the version numbers of both files would be raised, making sure that the uncompressed version was odd (e.g. 103, 105, 139) and the compressed version was even (e.g. 104, 106, 140). The key point to remember is that odd numbers indicate uncompressed files, while even numbers indicate that compression was used.
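The odd/even rule can be checked in a few lines. This is only a sketch: `parse_version_line` is a hypothetical name, and it assumes the version number is the first whitespace-delimited token on the line, as in the sample above.

```python
def parse_version_line(line):
    """Return (version, is_compressed) from the first line of an MDF/MTS file."""
    fields = line.split(None, 1)
    if not fields:
        raise ValueError("empty version line")
    version = int(fields[0])
    # Anything after the version number (the copyright statement) is ignored.
    # Odd versions are uncompressed, even versions are compressed.
    return version, version % 2 == 0


sample = "101 !Copyright (c) 1995 Oklahoma Climatological Survey."
print(parse_version_line(sample))  # (101, False)
```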

The copyright statement is ignored by our current software since it is superfluous to data gathering. Currently there are no plans to parse or display any portion of the copyright statement. It is there solely to protect OCS and OU intellectual property.

2.2 The number of parameters and time stamp line

The next line contains the number of parameters and the time stamp, all delimited by spaces:

  5 1994 07 07 00 00 00

The first number (5 in this example) indicates the number of measured parameters that can be found in the MDF or MTS file. This number does not include the first three parameters listed in the parameter identifier line or in the data records (station identifier, station number, and time).

The other numbers in the line indicate year, month, day, hour, minute, and second respectively. This time is a base time for the file and may or may not represent any actual time used in the data itself. More on this later.
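A reader might turn this line into a parameter count and a base time as sketched below. The name `parse_header_line` is hypothetical, and the sketch assumes exactly seven space-delimited integers, as in the sample line.

```python
from datetime import datetime


def parse_header_line(line):
    """Return (parameter_count, base_time) from the second line of the file."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    count, year, month, day, hour, minute, second = (int(f) for f in fields)
    # datetime() itself rejects impossible dates such as month 13
    return count, datetime(year, month, day, hour, minute, second)


count, base_time = parse_header_line("  5 1994 07 07 00 00 00")
print(count, base_time.isoformat())  # 5 1994-07-07T00:00:00
```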


2.3 The parameter identifier line

The next line in an uncompressed MDF/MTS file contains a list of parameter identifiers delimited by spaces:

 STID   STNM    TIME    RELH    TAIR    WSPD    WVEC    WDIR

This line indicates all the parameters stored in the data records. The first three identifiers in the parameter identifiers line must be STID, STNM, and TIME, and they must be in that order. They are not counted in the count of parameters used in the MDF file. Therefore, the number of identifiers is always 3 more than the number of parameters indicated in the number of parameters and time stamp line.

Parameter IDs must be four characters, and have always been separated by four spaces (except for STID and STNM, which are separated by three spaces). Most of our software is dependent on the four-character restriction, and it's possible that some older software is dependent on the spacing.
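The rules for this line (fixed first three identifiers, four-character IDs, and a count three greater than the header value) could be checked as in the following sketch. The function name `parse_identifier_line` is hypothetical, and the sketch does not enforce the exact spacing between identifiers.

```python
REQUIRED_PREFIX = ["STID", "STNM", "TIME"]


def parse_identifier_line(line, expected_count):
    """Validate the identifier line and return the measured-parameter IDs."""
    ids = line.split()
    if ids[:3] != REQUIRED_PREFIX:
        raise ValueError("line must start with STID, STNM, TIME")
    if any(len(name) != 4 for name in ids):
        raise ValueError("parameter IDs must be four characters")
    # STID, STNM, and TIME are not included in the header count
    if len(ids) != expected_count + 3:
        raise ValueError("expected %d identifiers, got %d" % (expected_count + 3, len(ids)))
    return ids[3:]


line = " STID   STNM    TIME    RELH    TAIR    WSPD    WVEC    WDIR"
print(parse_identifier_line(line, 5))
```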


2.4 The data records

The last section of an MDF/MTS file contains a line of data for each station/time stored in the file:

ADAX     1  1020     57   32.8    8.4    8.1   188
ALTU     2  1020     45   35.1    5.4    5.3   271
ALVA     3  1020     70   25.0    9.1    9.1    51
ANTL     4  1020     71   31.5    4.6    4.4   186

...

All parameter values are delimited by space characters and are right justified (aligned by the rightmost character) so all data values will line up in a tabular format. This was done to facilitate both viewing and parsing. If the values are not right justified and space delimited, the current software will not be able to correctly parse the data, resulting in no output or a possible crash.
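Because the fields are space delimited, one record can be read by splitting on whitespace. The sketch below is an assumption about how a reader might do this, not a description of the current software; `parse_record` is a hypothetical name, and storing every measured value as a float is a choice made for this example.

```python
def parse_record(line, param_ids):
    """Split one data record into (stid, stnm, minutes, values)."""
    fields = line.split()
    if len(fields) != len(param_ids) + 3:
        raise ValueError("expected %d fields, got %d" % (len(param_ids) + 3, len(fields)))
    stid = fields[0]          # kept as text, even if it looks numeric
    stnm = int(fields[1])     # station number
    minutes = int(fields[2])  # TIME field
    values = {name: float(text) for name, text in zip(param_ids, fields[3:])}
    return stid, stnm, minutes, values


params = ["RELH", "TAIR", "WSPD", "WVEC", "WDIR"]
print(parse_record("ALVA     3  1020     70   25.0    9.1    9.1    51", params))
```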

The station identifier (STID) is always a text value, meaning it is stored as a string in a data file. While it may contain numbers, or even be entirely numeric, it is still parsed as a literal string. Once again, this parameter is not counted among the number of parameters indicated by the number of parameters line.

The station number (STNM) is an integer which identifies a station within a certain network. This numeric identifier may be used to obtain station information from a network resource. Once again, this parameter is not counted among the number of parameters indicated by the number of parameters line.

The time field (TIME) is an integer value which indicates the number of minutes past the base time. For example, the base time indicated above shows 0 hours, 0 minutes, and 0 seconds, which translates into 00:00:00 UTC. The data record for ALVA above has an offset of 1020 minutes (17 hours) from the base time. This means that the data collected for the station identified by ALVA (which just happens to be Alva, OK) was measured at (00:00:00 + 17 hours) 17:00:00 UTC.

The base time can be non-zero, but it is the file creator's responsibility to ensure that the base time and the value in the TIME field for each record add up to the true time of the data in the record.
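The arithmetic is simply base time plus the TIME offset in minutes. A short sketch, using the ALVA record and the base time from the sample file (the function name `observation_time` is hypothetical):

```python
from datetime import datetime, timedelta, timezone


def observation_time(base_time, minutes_offset):
    """Add the TIME field (minutes past the base time) to the base time."""
    return base_time + timedelta(minutes=minutes_offset)


base = datetime(1994, 7, 7, 0, 0, 0, tzinfo=timezone.utc)
print(observation_time(base, 1020).isoformat())  # 1994-07-07T17:00:00+00:00
```

The same function works for a non-zero base time: a base of 06:00:00 with an offset of 60 minutes yields 07:00:00.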

In an MDF file the station identifier and station number change, but the time offset does not vary among the records. In an MTS file, the station identifier and number do not vary, but the time offset does.
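This distinction suggests a simple heuristic for telling the two kinds of file apart from their records alone. The sketch below is only an assumption about how one might do it (the name `classify_records` is hypothetical), and it returns "unknown" when the records do not settle the question, for example when a file holds a single record.

```python
def classify_records(records):
    """Guess the file kind from (stid, stnm, minutes) tuples.

    Returns "MDF", "MTS", or "unknown".
    """
    records = list(records)
    stations = {(stid, stnm) for stid, stnm, _ in records}
    offsets = {minutes for _, _, minutes in records}
    if len(offsets) == 1 and len(stations) > 1:
        return "MDF"  # many stations, one time
    if len(stations) == 1 and len(offsets) > 1:
        return "MTS"  # one station, many times
    return "unknown"


print(classify_records([("ADAX", 1, 1020), ("ALTU", 2, 1020), ("ALVA", 3, 1020)]))  # MDF
```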

The other fields in a record indicate the actual recorded value of the parameters in the parameter identifier line. The units of each parameter class are given in Table 2.1.

Parameter Class    Unit
---------------    ----------------------
Temperature        degrees centigrade
Humidity           percent
Velocity           meters per second
Direction          degrees
Pressure           millibars
Precipitation      inches
Radiation          watts per square meter

Table 2.1. Units of measure for MDF and MTS files.


For the example data above, at ALVA the temperature at 1.5 meters above ground (TAIR) is 25 degrees centigrade, the relative humidity (RELH) is 70 percent, the wind speed (WSPD) is 9.1 meters per second, etc. Remember that all values are right justified and space delimited and all data records are of a constant length in the MDF or MTS file.
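For those writing their own files, right justification and constant record length can be obtained by padding each field to a fixed column width. The widths in this sketch are hypothetical: they were inferred from the sample records in this note and are not stated by the specification. The name `format_record` is also hypothetical.

```python
# Hypothetical column widths inferred from the sample records in this note:
# STNM, TIME, RELH, TAIR, WSPD, WVEC, WDIR
WIDTHS = [6, 6, 7, 7, 7, 7, 6]


def format_record(stid, fields):
    """Build one data record: STID followed by right-justified text fields."""
    if len(fields) != len(WIDTHS):
        raise ValueError("expected %d fields, got %d" % (len(WIDTHS), len(fields)))
    columns = [text.rjust(width) for text, width in zip(fields, WIDTHS)]
    return stid.ljust(4) + "".join(columns)


alva = format_record("ALVA", ["3", "1020", "70", "25.0", "9.1", "9.1", "51"])
adax = format_record("ADAX", ["1", "1020", "57", "32.8", "8.4", "8.1", "188"])
print(alva)
print(adax)
print(len(alva) == len(adax))  # True: both records have the same length
```

Values are passed in as already formatted text so that the caller decides how many decimal places each parameter carries.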


3. Handling Missing Data

Whenever a record has no value for a certain parameter, for any reason, a missing data value is substituted for that value. This missing data value must be less than -900. Any value of -900 or greater (e.g. -888 or -777) will be interpreted as an actual data value by the current revisions of the software. This could lead to undesirable results.

Currently, there are four missing data values that are defined explicitly by OCS:

Missing Data Value    Defined as
------------------    --------------------------------------------------
-999                  Data flagged as bad by quality assurance routines.
-998                  No sensor for this parameter at this station.
-997                  Sensor temporarily off-line.
-996                  Station did not report.

Table 3.1. Missing data values and their meaning.


Future revisions of the MDF/MTS specification may use further codes to represent new missing data definitions.

It is not necessary to conform to these missing data codes when using user-supplied or massaged data, but be aware that the software will most likely interpret the above values as missing data with the meanings listed in Table 3.1.
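A reader could apply the "less than -900" rule and the codes of Table 3.1 as sketched below. The names `MISSING_CODES` and `decode_value` are hypothetical, and the label used for undefined codes is an assumption of this sketch.

```python
MISSING_CODES = {
    -999: "data flagged as bad by quality assurance routines",
    -998: "no sensor for this parameter at this station",
    -997: "sensor temporarily off-line",
    -996: "station did not report",
}


def decode_value(raw):
    """Return (value, reason); value is None when raw is a missing-data value."""
    if raw < -900:
        # Codes not listed in Table 3.1 are still treated as missing
        reason = MISSING_CODES.get(int(raw), "missing (undefined code)")
        return None, reason
    return raw, None


print(decode_value(-998.0))  # (None, 'no sensor for this parameter at this station')
print(decode_value(-888.0))  # (-888.0, None): treated as real data
```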


4. Conclusion

The uncompressed MDF and MTS format is the one most used by those wishing to supply their own data for certain parameters, as it lends itself both to use in software and to viewing in text editors and/or spreadsheets. The current revisions of the plugins (3.1 as of this writing) allow for user-defined parameter identifiers and values; however, conversion and units are not supported for these user-defined identifiers. If one wishes to plot some data using software developed within OCS, it is highly recommended to use the uncompressed MDF/MTS specification.
