groupKey: a single column key or a tuple of column keys that
define the groups via the Table.levels method or a callable
that accepts a record and returns the level for this record
or None indicating that this record should be excluded from
the groups
Note that the groups are formed by appending each record in the input
table to exactly one (freshly instantiated) group table. The
number of group tables is defined by the number of levels induced
by groupKey.
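The grouping step described above can be sketched on a simple model in which records are dicts and each group is a plain list; the function name and model are illustrative, not the library's actual implementation:

```python
from collections import defaultdict

def group_records(records, group_key):
    """Sketch of the grouping step: each record lands in exactly one
    freshly created group, keyed by its level. Records whose level is
    None are excluded. group_key may be a single column key, a tuple
    of column keys, or a callable (illustrative helper)."""
    if callable(group_key):
        level_of = group_key
    elif isinstance(group_key, tuple):
        # concatenate the values of several columns to form the level
        level_of = lambda rec: "".join(str(rec[k]) for k in group_key)
    else:
        level_of = lambda rec: rec[group_key]
    groups = defaultdict(list)          # level -> freshly built group
    for rec in records:
        level = level_of(rec)
        if level is not None:           # None means: exclude this record
            groups[level].append(rec)
    return dict(groups)
```

The number of resulting groups equals the number of distinct non-None levels.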
aggregates the groups in this table group using the common column key
aggregationKey. For each group, aggregationKey is passed to the
Table.__getitem__ method of the corresponding Table instance, and
the result is passed to the statistics callable.
Value:
a dictionary ::
{ <level> : <aggregated quantity> }
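On the same dict-of-lists model, the aggregation step can be sketched as follows (names are illustrative; in the library the column extraction goes through Table.__getitem__):

```python
def aggregate_groups(groups, aggregation_key, statistics):
    """Sketch of the aggregation: for each group, pull the column named
    by aggregation_key and feed it to the statistics callable,
    producing { level : aggregated quantity } (illustrative model)."""
    return {level: statistics([rec[aggregation_key] for rec in group])
            for level, group in groups.items()}
```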
implements containment. Returns true if the values for all column keys
in self that are also in other are equal. Note that this does not
include the case where other defines additional keys not defined by
self. This way, the only presumption made about other is that it
supports sequential access.
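One plausible reading of these containment semantics, sketched as a standalone helper (the record is modeled as a dict and other as a sequence of key/value pairs; this is an assumption, not the library's code):

```python
def contains(record, other):
    """Sketch of the containment test: other is contained if every
    column key of other that also exists in record maps to an equal
    value; keys present only in other are ignored. Only sequential
    access to other is assumed (illustrative helper)."""
    return all(value == record[key]
               for key, value in other
               if key in record)
```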
a record-oriented data table (or "data base") with variable
column data types
Detail:
Adds record-based operations to the base class.
Also adds row labels, which are perceived as cases. See
the new_table factory function for flexible creation of
tables.
Tables may contain different data types in different columns
(the behavior adopted here is very much derived from S-plus
frames). The data are kept as NumPy object arrays and converted
to the corresponding column type only when necessary (and if
possible).
The following column type codes are recognized:
"i": Integer
"f": for Float
"c": character (string)
"b": boolean
"D": date/time
ToDo:
make skip() honor the current index (i.e., have the sequence of
records be defined by the current index)
returns the record (key is an integer), column (key is a string),
element (key is a 2-tuple), or sub-table (key is a slice, a tuple,
or a list) specified by key.
If key is a slice, it will retrieve along rows (integer values for
start and stop) or along columns (string values for start and stop).
If key is a list, the return value will be a sub-table made up from
the specified rows (list of integers) or columns (list of strings).
If key is a tuple of strings, this will also return a sub-table made
up from the specified columns.
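The key dispatch described above can be summarized in a small classifier; this is an illustrative reading of the rules, not the library's __getitem__ implementation:

```python
def classify_key(key):
    """Sketch of the key dispatch: report which kind of result a given
    key would select (illustrative helper)."""
    if isinstance(key, int):
        return "record"
    if isinstance(key, str):
        return "column"
    if isinstance(key, slice):
        # integer bounds slice rows, string bounds slice columns
        row_wise = isinstance(key.start, int) or isinstance(key.stop, int)
        return "sub-table (rows)" if row_wise else "sub-table (columns)"
    if isinstance(key, tuple):
        if all(isinstance(k, str) for k in key):
            return "sub-table (columns)"
        if len(key) == 2:
            return "element"
        raise TypeError("mixed tuple keys are only valid as 2-tuples")
    if isinstance(key, list):
        return "sub-table"
    raise TypeError(f"unsupported key type: {type(key).__name__}")
```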
replace the row (key is an integer) or column (key is a string) or
field (key is a 2-tuple (<integer>,<integer>) or
(<integer>,<string>)) specified by key with value.
If key is an integer, value is expected to have an .items()
method that returns <column key> : <column value> pairs.
If key is a string, value is expected to be a sequence object of
the same length as the table.
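These replacement rules can be sketched on a columns-dict model ({column key: [values]}); the model and helper name are illustrative, not the library's storage or API:

```python
def set_item(table, key, value):
    """Sketch of the replacement rules: an integer key replaces a row
    from value.items(), a string key replaces a whole column with an
    equal-length sequence, and a 2-tuple addresses a single field
    (illustrative model)."""
    if isinstance(key, int):
        for col_key, col_value in value.items():
            table[col_key][key] = col_value
    elif isinstance(key, str):
        n_rows = len(next(iter(table.values())))
        if len(value) != n_rows:
            raise ValueError("column length must match table length")
        table[key] = list(value)
    elif isinstance(key, tuple):            # (row, column) field access
        row, col = key
        # an integer column index is resolved via the column order
        col_key = col if isinstance(col, str) else list(table)[col]
        table[col_key][row] = value
```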
extend this table with the columns of the table other. If inPlace
is true, columns of other are added to the current table; otherwise,
a copy of the extended table is returned. If addIndices is true,
all indices defined on other are added as well.
Note that the row labels are not updated or checked for
inconsistencies.
returns the index of the record for which the index specified by
indexName has the value given in indexValue. Raises an IndexError
if no such record exists and a KeyError if the specified index does
not exist.
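The error semantics can be sketched with indices modeled as nested mappings; the model is illustrative, not the library's index representation:

```python
def get_record_index(indices, index_name, index_value):
    """Sketch of the lookup semantics: indices maps each index name to
    a {index value: record position} mapping. A missing index raises
    KeyError, a missing value raises IndexError (illustrative model)."""
    if index_name not in indices:
        raise KeyError(index_name)
    index = indices[index_name]
    if index_value not in index:
        raise IndexError(f"no record with index value {index_value!r}")
    return index[index_value]
```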
insert the data in recordData into the table at position position.
recordData is a mapping object with a .items() method or None, in
which case an empty record is appended.
returns a set of the levels of the (ordinal) data specified by
columnKeyOrKeys, which may be either a single column key or a
tuple of column keys. In the latter case, levels are formed by
concatenating the data from the specified columns (after conversion
to strings).
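A sketch of the levels computation on a columns-dict model (illustrative names, not the library's implementation):

```python
def levels(columns, column_key_or_keys):
    """Sketch of the levels computation: the set of distinct values of
    a single column, or of string-concatenated values for a tuple of
    column keys (columns is a {key: [values]} model; illustrative)."""
    if isinstance(column_key_or_keys, tuple):
        cols = [columns[k] for k in column_key_or_keys]
        return {"".join(str(v) for v in row) for row in zip(*cols)}
    return set(columns[column_key_or_keys])
```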
renames the column oldColumnKey to newColumnKey. All indices
containing oldColumnKey explicitly are updated. An error is raised
if an index has a callable as value function and might implicitly
depend on the name of the old column.
replaces pattern pat in field columnKey with replacement rep.
If rep is a dictionary, it is expected to provide a replacement
string for each matched pattern in pat, i.e. it should look like
{ <pattern name> : <replacement string> }
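One possible reading of the dictionary form, sketched with re named groups on a plain list of strings (the interpretation that pattern names are regex group names is an assumption, not confirmed by the docstring):

```python
import re

def replace_in_column(values, pat, rep):
    """Sketch of the replacement rules: a string rep is passed straight
    to re.sub; a dict rep supplies one replacement string per named
    group in pat, chosen by whichever group matched (illustrative
    reading, not the library's exact semantics)."""
    regex = re.compile(pat)
    if isinstance(rep, dict):
        def substitute(match):
            return rep[match.lastgroup]     # name of the matching group
        return [regex.sub(substitute, v) for v in values]
    return [regex.sub(rep, v) for v in values]
```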
simple record select operation - returns a table containing only
records for which the function conditionF returns true. If
key is None, conditionF will be passed each record
contained in the table in turn; otherwise, it will be passed only the
data from the specified column(s).
Note that the returned table will not have any of the indices defined
on the source table.
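The select semantics can be sketched on dict records (illustrative model, not the library's code):

```python
def select(records, condition_f, key=None):
    """Sketch of select: with key=None the predicate sees whole
    records; otherwise only the value of the named column
    (illustrative model)."""
    if key is None:
        return [rec for rec in records if condition_f(rec)]
    return [rec for rec in records if condition_f(rec[key])]
```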
physically sorts the table by the column column, which can be given
as integer, as column label, as a 1-element sequence of a column key or
index, or as an arbitrary-length sequence of column keys for
multi-column sorting. Note that multi-column sorting is fairly
inefficient (uses a temporary TableIndex).
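The normalization of the column argument into a sort key can be sketched as follows (dict-record model; illustrative, and without the TableIndex machinery mentioned above):

```python
def sort_records(records, column):
    """Sketch of the sort-key handling: column may be a single key, a
    1-element sequence, or a sequence of keys for multi-column sorting
    (illustrative model)."""
    keys = [column] if isinstance(column, (int, str)) else list(column)
    records.sort(key=lambda rec: tuple(rec[k] for k in keys))
```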
sets all key:value pairs given in slotValueD as attributes of instance.
Checks first if any of the keys in slotValueD is _not_ a slot of
instance, in which case a ValueError is raised.
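This check-then-assign behavior can be sketched for a class using __slots__ (illustrative helper name):

```python
def set_slots(instance, slot_value_d):
    """Sketch of the slot-checking assignment: reject any key that is
    not a declared slot before setting anything (illustrative)."""
    unknown = [k for k in slot_value_d if k not in instance.__slots__]
    if unknown:
        raise ValueError(
            f"not slots of {type(instance).__name__}: {unknown}")
    for key, value in slot_value_d.items():
        setattr(instance, key, value)
```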
CODES = {
    'append_inconsistent_type': ('inconsistent data types during append operation encountered', ''),
    'end_of_file': ('End of file encountered during operation!', ''),
    'extend_inconsistent_shape': ('extension of a table only works if both tables have the same number of rows', ''),
    'extend_inconsistent_types': ('cannot extend a table with an object that is not a table', ''),
    'invalid_key_list': ('invalid column key list found', ''),
    'key_not_found': ('column key not found', ''),
    'malformed_index': ('invalid table index', ''),
    'value_conversion_failed': ('could not convert values to the specified column types', ''),
}
CODES = {
    'init_invalid_columnkeys': ('Error during initialization: invalid column keys.', ''),
    'init_invalid_typecode': ('Error during initialization: invalid typecode information', ''),
    'init_record_invalid_options': ('Error during initialization: options passed that are only valid for a table', ''),
    'init_sequence_conversion_shape': ('Error during initialization: irregular shape of input data sequence', ''),
    'unknown_input_type': ('Unknown type of input data', ''),
}
indexObject is either a callable which accepts a record and
returns a value, a string referring to a column name or a
list/tuple of strings referring to several column names.
The maintain flag signifies that this index should be
updated whenever a record is inserted into or deleted from the table.
The persistent flag signifies that this index should be
stored to disk together with the table data.
find the record index for which the index assumes the value
indexValue. If exact is true, raise an IndexError if no such index
is found; otherwise, return the nearest record index with a value just
bigger than indexValue (or raise an IndexError, if the biggest
available value is still smaller than indexValue).
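The exact/nearest lookup can be sketched on a sorted list of index values using bisect (illustrative model, not the library's index structure):

```python
import bisect

def find(sorted_values, index_value, exact=True):
    """Sketch of the lookup: exact lookup raises IndexError when the
    value is absent; inexact lookup returns the position of the nearest
    value just bigger than index_value, or raises IndexError if even
    the biggest value is smaller (illustrative model)."""
    pos = bisect.bisect_left(sorted_values, index_value)
    if pos < len(sorted_values) and sorted_values[pos] == index_value:
        return pos
    if exact:
        raise IndexError(f"value {index_value!r} not found")
    if pos == len(sorted_values):
        raise IndexError("all index values are smaller")
    return pos
```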
sets all key:value pairs given in slotValueD as attributes of instance.
Checks first if any of the keys in slotValueD is _not_ a slot of
instance, in which case a ValueError is raised.
the two ordinal columns serve as column and row labels for the
resulting table, respectively
the numeric column provides the values for the cells of the
resulting table; each cell represents the sum of all values of
the numeric column that were associated with a particular
row/column value combination. If None is passed as numeric
field, the value 1 is assumed instead.
In the simplest case, specify one ordinal column to form the column
levels ("variables"), one ordinal column to form the row levels
("samples"), and optionally one numeric column to supply the values
for each unique column/row level combination in the resulting
contingency table.
It is also possible to pass tuples of column keys for the column
and/or the row levels arguments, in which case the levels will be
formed by concatenating the values from all specifying columns
(after conversion to strings).
statistics: by default, all values in the same contingency
category (i.e., the same column/row level combination) will be
summed up (using Numeric.add). Change this behavior by passing
a custom function (e.g., pdk.Math.Descriptive.average) here
mode: "R" or "Q". "Q" essentially returns an inverted
("variables-by-samples") table
Value:
returns a Table instance which has the column levels as
column keys, the row levels as row labels, and the result
of applying the statistics function to all unique column/row
level combinations as cell values.
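The core of the contingency computation can be sketched with dict records and a flat {(row level, column level): value} result, sum as the default statistics; the names and the flat result shape are illustrative simplifications of the Table returned by the library:

```python
from collections import defaultdict

def contingency(records, col_key, row_key, num_key=None, statistics=sum):
    """Sketch of the contingency computation: collect the numeric
    column (or 1 when num_key is None) per unique row/column level
    pair, then reduce each cell with statistics (illustrative model)."""
    cells = defaultdict(list)
    for rec in records:
        value = 1 if num_key is None else rec[num_key]
        cells[(rec[row_key], rec[col_key])].append(value)
    return {pair: statistics(values) for pair, values in cells.items()}
```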
converts the data in table to a delimited ASCII-file
(record-structured) and writes it to outFileNameOrStream, which is
either a file name (for a file to be opened with mode fileMode) or a
stream of some sort with a .write() method.
The first row of the output contains the field labels (unless
writeColumnLabels is false), the first column of each row the row
labels, if present (and if writeRowLabels is true). fieldDelimiter
will be used to delimit data fields, stringDelimiter to delimit
string values.
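The row layout described above can be sketched with the csv module on a columns-dict model (illustrative; the library's writer handles file opening, modes, and missing row labels as well):

```python
import csv
import io

def write_table(columns, row_labels, stream,
                field_delimiter=",", string_delimiter='"'):
    """Sketch of the export: first row carries the column labels,
    first field of each data row the row label ({key: [values]}
    model; illustrative)."""
    writer = csv.writer(stream, delimiter=field_delimiter,
                        quotechar=string_delimiter)
    keys = list(columns)
    writer.writerow([""] + keys)            # header: labels column + keys
    for i, label in enumerate(row_labels):
        writer.writerow([label] + [columns[k][i] for k in keys])
```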
reads raw data (field-structured: FIELD1 FIELD2 ... FIELDn)
from an ASCII file or stream into a datatable. Default field delimiter
is DEFAULTFIELDDELIMITER, default string delimiter is
DEFAULTSTRINGDELIMITER.
If hasColumnKeys == True, the first line is assumed to contain
the field (=column) labels.
If hasRowLabels == True or the first column label is
ROW_LABELS, the first data column is treated as row identifiers.
Note that further keyword arguments are passed to the .new method
that is used to create the imported table.
columnTypeCodes: a sequence of type codes, one for each column
(defaults to "c" for all columns)
convertFromStrings: if True, the data in data are treated as
strings and a conversion to the type code(s) specified by
typeCode is attempted
optionD: further options for the Table constructor
If initialized empty, the length of the columnKeyL and rowLabels
parameters determines the shape of the underlying data array (the
number of rows is zero if rowLabels is None).
convenience function to aggregate a Table instance table grouped by
groupKey using aggregationKey and the aggregation statistics
statistics. See the GroupedTable documentation for more details.