This is the Electronic Data Input Program (EDIP).

Overview of the EDIP system

EDIP stands for Electronic Data Input Program. It lets you enter data in a spreadsheet program of your choice and then convert it to standard SELFML files. The standard was developed by the Task Group on Standardization of Physico-Chemical Property Electronic Data Files.

If you are not interested in the API documentation, you can find the binary distribution on Z. Wagner's SELFML page.

The full source code is available from SourceForge, package selfml.

The program is modular and works in two steps. First, it examines the type of spreadsheet used and calls the corresponding spreadsheet parser, which converts the source to the internal XML document. In the second step, the property type is determined and the corresponding property builder converts the internal document to the final SELFML file.
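The two-step flow can be pictured with a small, self-contained sketch. Everything in it is hypothetical: the class and method names are not part of the EDIP API, the tab-separated input stands in for a real spreadsheet file, a plain Java object stands in for the internal XML document (EDIP itself uses dom4j), and the output markup is a placeholder, not SELFML.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these types belong to EDIP.
final class PipelineSketch {

    /** Stand-in for the internal document: a property type plus rows of cells. */
    static final class InternalDoc {
        final String propertyType;
        final List<List<String>> rows;

        InternalDoc(String propertyType, List<List<String>> rows) {
            this.propertyType = propertyType;
            this.rows = rows;
        }
    }

    /** Step 1: a toy "spreadsheet parser" for tab-separated text. */
    static InternalDoc parse(String source) {
        String[] lines = source.trim().split("\n");
        // Assumption of this sketch: the first line names the property type.
        String propertyType = lines[0].trim();
        List<List<String>> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            rows.add(Arrays.asList(lines[i].split("\t")));
        }
        return new InternalDoc(propertyType, rows);
    }

    /** Step 2: a toy "property builder"; the markup is a placeholder, not SELFML. */
    static String build(InternalDoc doc) {
        StringBuilder out = new StringBuilder();
        out.append("<data property=\"").append(doc.propertyType).append("\">");
        for (List<String> row : doc.rows) {
            out.append("<row>").append(String.join(",", row)).append("</row>");
        }
        return out.append("</data>").toString();
    }

    public static void main(String[] args) {
        InternalDoc doc = parse("density\n298.15\t997.0\n308.15\t994.0\n");
        System.out.println(build(doc));
    }
}
```

The point of the split is that the parser knows only about the spreadsheet format and the builder knows only about the property type, so either can be replaced without touching the other.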

The internal XML document normally exists only in memory. However, mainly for debugging, it can be written to a file on disk instead of being passed to the property builder. This is useful for developers who wish to see how the source file is parsed before the property builder is written.

Conversely, EDIP can build the SELFML file directly from the internal file. If your spreadsheet cannot save the file as XML and you convert it to XML by other means, you can generate the internal file directly, using the structure defined in the Spreadsheet Package Description.

EDIP does not perform any conversion itself. It merely reads the input document, asks the spreadsheet parser to create the internal document, and then asks the property builder to generate the final document. EDIP then writes this document to a file. The system can therefore work only if EDIP knows the correct spreadsheet parser and property builder, so a configuration file containing this information must be created. Users can write their own spreadsheet parser and/or property builder. In that case, the class must be installed using the procedure described below.
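A dispatcher of this kind is commonly implemented by mapping a lookup key to a class name and instantiating the class by reflection. The sketch below shows that general technique only. It is an assumption about how such a lookup could work, not EDIP's actual code: the class DispatchSketch and its methods are hypothetical, and a JDK class stands in for a real parser or builder.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the names below are illustrative and are not EDIP classes.
final class DispatchSketch {

    /** Maps a lookup key to a fully qualified class name, as a configuration might. */
    private final Map<String, String> classNames = new HashMap<>();

    void register(String key, String className) {
        classNames.put(key, className);
    }

    /** Creates the handler configured for the key, or fails if none is configured. */
    Object create(String key) {
        String className = classNames.get(key);
        if (className == null) {
            throw new IllegalStateException("No class configured for: " + key);
        }
        try {
            // Assumption of this sketch: the class has a public no-argument constructor.
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        DispatchSketch dispatch = new DispatchSketch();
        // A JDK class stands in for a real parser or builder class.
        dispatch.register("demo", "java.util.ArrayList");
        System.out.println(dispatch.create("demo").getClass().getName());
    }
}
```

With this design, a user-written class only needs to be on the class path and listed in the configuration; the dispatcher itself never has to be recompiled.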

Usage

EDIP is built on the dom4j classes. You must have them in your CLASSPATH together with edip.jar, or you must supply both in the -classpath or -cp command line option (depending on the Java implementation). The next version will also require the entity resolver available from Apache XML Commons.

EDIP is generally invoked by:

java java_options cz.cas.icpf.hroch486.selfml.Edip [function] [options]

The java_options can specify the path to the dom4j jar file and edip.jar unless they are defined in the CLASSPATH. They must also define the location of the configuration file. This is done by setting the property:

-Dedip.config=/full/path/to/edipcfg.xml

The path need not be absolute. It may be relative, but you must then make sure that you always invoke EDIP from the same directory. If the configuration file does not exist, EDIP issues a warning and generates an empty one, which is written to disk only if some change is made to it.
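The reason a relative path ties you to one directory can be seen in a short sketch. A value passed with -D is visible to Java code through System.getProperty, and a relative path is resolved against the current working directory. The class ConfigPathSketch and its resolve method are hypothetical helpers written for this illustration; how EDIP itself reads the property is not shown here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of EDIP.
final class ConfigPathSketch {

    /**
     * Resolves the configured path against the working directory.
     * An absolute path is returned unchanged (apart from normalization).
     */
    static Path resolve(String configured, Path workingDir) {
        if (configured == null) {
            throw new IllegalStateException("edip.config is not set");
        }
        return workingDir.resolve(configured).normalize();
    }

    public static void main(String[] args) {
        // Set on the command line with -Dedip.config=...
        String configured = System.getProperty("edip.config");
        if (configured == null) {
            System.err.println("edip.config is not set");
            return;
        }
        Path workingDir = Paths.get("").toAbsolutePath();
        System.out.println(resolve(configured, workingDir));
    }
}
```

Running the same command from two different directories with a relative value therefore points at two different files, while an absolute value always points at the same one.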

Most often you will not supply any function, just one or two file names. The first file name specifies the source to be converted; the second specifies the name of the SELFML file. If only one file name is given, the source file is overwritten. However, EDIP starts writing only if the data structure is correct. If an error is found in the source data, a message is displayed and the original source file is retained.
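One common way to get this "write only after success" behaviour is to finish the whole conversion in memory before the output file is opened. The following is a minimal sketch of that pattern under stated assumptions: SafeOverwriteSketch and its convert method are hypothetical, the converter is a placeholder function on strings, and nothing here is taken from EDIP's source.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

// Hypothetical sketch; not EDIP's actual implementation.
final class SafeOverwriteSketch {

    /**
     * Converts source to target. The converter runs entirely in memory, so if it
     * throws, the target (which may be the source itself) is never opened for writing.
     */
    static void convert(Path source, Path target, UnaryOperator<String> converter) {
        try {
            String input = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
            String output = converter.apply(input); // may throw on invalid data
            Files.write(target, output.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: SafeOverwriteSketch source [target]");
            return;
        }
        Path source = Paths.get(args[0]);
        // With a single file name, the source doubles as the target.
        Path target = args.length > 1 ? Paths.get(args[1]) : source;
        convert(source, target, String::toUpperCase); // placeholder converter
    }
}
```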

You can specify the function -internal followed by one or two file names. In this case the property builder is not invoked (EDIP does not even look for it) and the internal document is written to the output XML file. This is useful mainly for testing.

You can also choose one of the following functions:

-view
The function displays the contents of the configuration file. The spreadsheet parsers are displayed in the form:
namespace_uri :: root_element => classname
The property builders are displayed similarly as:
property_type => classname
-add classname [classname ...]
The function adds spreadsheet parsers and/or property builders by their fully qualified class names. The new configuration file is then written and displayed.
-remove classname [classname ...]
The function removes spreadsheet parsers and/or property builders by their fully qualified class names. The new configuration file is then written and displayed.
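The display formats and the removal by class name described above can be modelled with two maps. This is a hypothetical in-memory model for illustration only: ConfigViewSketch is not an EDIP class, the layout of edipcfg.xml is not described here, and the sample names (urn:example:sheet, Workbook, example.SheetParser, example.DensityBuilder) are invented. The sketch also passes the keys explicitly, because this document does not say how EDIP learns them from a class given to -add.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; not EDIP's configuration classes.
final class ConfigViewSketch {

    // key: "namespace_uri :: root_element", value: parser class name
    private final Map<String, String> parsers = new LinkedHashMap<>();
    // key: property type, value: builder class name
    private final Map<String, String> builders = new LinkedHashMap<>();

    void addParser(String namespaceUri, String rootElement, String className) {
        parsers.put(namespaceUri + " :: " + rootElement, className);
    }

    void addBuilder(String propertyType, String className) {
        builders.put(propertyType, className);
    }

    /** Removes every entry registered under the class name; true if anything changed. */
    boolean remove(String className) {
        boolean parserRemoved = parsers.values().removeIf(className::equals);
        boolean builderRemoved = builders.values().removeIf(className::equals);
        return parserRemoved || builderRemoved;
    }

    /** Returns lines in the two display formats described for -view. */
    List<String> view() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : parsers.entrySet()) {
            lines.add(e.getKey() + " => " + e.getValue());
        }
        for (Map.Entry<String, String> e : builders.entrySet()) {
            lines.add(e.getKey() + " => " + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        ConfigViewSketch config = new ConfigViewSketch();
        config.addParser("urn:example:sheet", "Workbook", "example.SheetParser");
        config.addBuilder("density", "example.DensityBuilder");
        config.view().forEach(System.out::println);
    }
}
```

The boolean returned by remove mirrors the rule stated under Usage that the configuration is written to disk only when something has actually changed.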