Original Weka Documentation

AstroWeka is based on Weka version 3.4, and is very similar in operation. Documentation on using Weka is available from the Weka website, which includes a very useful wiki.

Most of the information in the original Weka documentation applies to AstroWeka, so this website mainly deals with the differences.

VOTable Conversion Issues

When loading VOTables there are a few things to take into consideration:

  1. All numerical data will be converted to double precision floating point values.
  2. Weka needs to be specifically told if an attribute is categorical.

Point one generally isn't a problem when working with data inside Weka; it only becomes a problem when Weka is used as part of a work flow involving other tools. For example, floating point numbers can't be reliably tested for exact equality and have to be tested over an interval, so if you process a VOTable using AstroWeka and then try to extract points based on a long integer id, the ids may no longer match. Currently the best way to deal with ids is to edit the VOTable header so that they are read in as strings.
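The id problem can be reproduced in plain Java without Weka. The sketch below is only an illustration: the class and method names (IdPrecision, survivesDoubleRoundTrip, approxEqual) are made up for this example and are not part of Weka or AstroWeka. It shows that a long id larger than 2^53 does not survive a round trip through a double, and that floating point values are better compared over an interval than with ==.

```java
// Illustrative only: these names are hypothetical and not part of Weka or AstroWeka.
class IdPrecision {

    // True when the id comes back unchanged after being stored as a double.
    static boolean survivesDoubleRoundTrip(long id) {
        return (long) (double) id == id;
    }

    // Interval test: treats a and b as equal when they differ by at most tol.
    static boolean approxEqual(double a, double b, double tol) {
        return Math.abs(a - b) <= tol;
    }

    public static void main(String[] args) {
        long smallId = 123456789L;
        long bigId = 9007199254740993L; // 2^53 + 1, not representable as a double

        System.out.println(survivesDoubleRoundTrip(smallId)); // true
        System.out.println(survivesDoubleRoundTrip(bigId));   // false
        System.out.println((long) (double) bigId);            // 9007199254740992

        System.out.println(0.1 + 0.2 == 0.3);                  // false
        System.out.println(approxEqual(0.1 + 0.2, 0.3, 1e-9)); // true
    }
}
```

Ids below 2^53 are stored exactly, so the problem only shows up with large identifiers. Reading ids in as strings avoids the conversion altogether.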

AstroWeka GUIs

AstroWeka has three graphical user interfaces, which extend Weka's GUIs with Virtual Observatory tools. They are accessed from the GUIChooser.

The AstroExplorer GUI

The AstroExplorer provides an interactive way to extract data from AstroGrid and experiment with different machine learning tools on it.

All of Weka's machine learning tools are available from the AstroExplorer, and can be accessed and configured using menus and forms. It provides the most convenient way to quickly set up and evaluate a machine learning task.

The Experimenter GUI

Often, finding the best learning scheme for a given task is a matter of trial and error. Several techniques will need to be tested with different parameters, and their results analyzed to find the most suitable one. The Experimenter is used to automate this process: it can queue up multiple machine learning algorithms to be run on multiple data sets, and collect statistics on their performance.

The Knowledge Flow GUI

The Knowledge Flow provides a work flow type environment for AstroWeka. It provides an alternative way of using AstroWeka for those who like to think in terms of data flowing through a system. In addition, this interface can sometimes be more efficient than the Experimenter, as it can be used to perform some tasks on data sets one record at a time without loading the entire set into memory.
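The memory advantage of working one record at a time can be sketched in plain Java. This is not the Knowledge Flow's own code, and the names used (StreamingMean, meanOfColumn) are hypothetical. It assumes records arrive as arrays of doubles from some iterator, and computes the mean of one column while holding only a running sum and count rather than the whole data set.

```java
import java.util.Arrays;
import java.util.Iterator;

// Illustrative only: a hypothetical helper, not part of Weka or AstroWeka.
class StreamingMean {

    // Consumes records one at a time and keeps only a running sum and count,
    // so memory use does not grow with the number of records.
    static double meanOfColumn(Iterator<double[]> records, int column) {
        long count = 0;
        double sum = 0.0;
        while (records.hasNext()) {
            double[] record = records.next();
            sum += record[column];
            count++;
        }
        // An empty stream has no mean; NaN signals that to the caller.
        return count == 0 ? Double.NaN : sum / count;
    }

    public static void main(String[] args) {
        // A small in-memory list stands in for a stream of records here.
        Iterator<double[]> records = Arrays.asList(
                new double[] {1.0, 10.0},
                new double[] {3.0, 20.0}).iterator();
        System.out.println(meanOfColumn(records, 0)); // 2.0
    }
}
```

Only tasks that can be updated incrementally, like the running mean here, fit this pattern. Tasks that need the whole data set at once still have to load it.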

Brian Walshe