NCSA
emerge@ncsa.uiuc.edu

Grunk Documentation

Overview

Grunk is a Java class library for extracting structured metadata from semi-structured text formats. It is designed for rapid development of applications that extract information from undocumented, unsupported, or simply difficult-to-use text file formats. It can be used from the command line, and it is also designed to be integrated into other applications.

Installation

You can get the installer here. Save it to your local file system. To run it under Java 1.2 or higher, issue java -jar grunk_in.jar and follow the instructions. Be sure to read any help. If you need to install it using an earlier version of Java, issue jre -cp grunk_in.jar; ncsa.emerge.grunk.installer.GrunkInstall, replacing the ";" with the path separator for your platform (e.g. ":" under unix).
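If you are unsure which path separator your platform uses, the following minimal sketch prints the earlier-Java installer command with the separator filled in. The class name InstallerCommand and the method legacyCommand are hypothetical names used only for illustration; they are not part of Grunk. The only library facility relied on is the standard java.io.File.pathSeparator constant.

```java
import java.io.File;

// Hypothetical helper, not part of Grunk: prints the pre-1.2 installer
// command from the paragraph above with the platform's path separator.
public class InstallerCommand {

    // Builds the command line, substituting the given path separator
    // for the ";" shown in the installation instructions.
    static String legacyCommand(String pathSeparator) {
        return "jre -cp grunk_in.jar" + pathSeparator
                + " ncsa.emerge.grunk.installer.GrunkInstall";
    }

    public static void main(String[] args) {
        // File.pathSeparator is ";" on Windows and ":" on unix-like systems.
        System.out.println(legacyCommand(File.pathSeparator));
    }
}
```

Run on the machine where you plan to install, this prints the command with the separator that platform expects.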

To check that Grunk has installed correctly, switch to the directory where you installed your scripts (the default is ..\grunk\scripts) and issue

grunk --help (Windows, OS/2)

sh grunk --help (unix)

This should print Grunk's basic help message, which lists its command options.

Reference

Here's a brief description of the API, and here's the javadoc API documentation for Grunk.

Here's a reference manual for the Grunk configuration format. There's also a simpler configuration format (suitable for many typical uses) called Grunk Lite.

Tutorial

Here's a tutorial showing how to configure Grunk to parse several example file formats, and here's one for Grunk Lite.