ABXpy Package¶
ABX discrimination is a term that is used for three stimuli presented on an ABX trial. The third is the focus. The first two stimuli (A and B) are standard, S1 and S2 in a randomly chosen order, and the subjects’ task is to choose which of the two is matched by the final stimulus (X). (Glottopedia)
This package contains the operations necessary to initialize, calculate and analyse the results of an ABX discrimination task.
Organisation¶
It is composed of four main modules and packages, plus other submodules.
- task module is used for creating a new task and preprocessing.
- distance package is used for calculating the distances necessary for the score calculation.
- score module is used for computing the score of a task.
- analyze module is used for analysing the results.
The features can be calculated in NumPy via external tools and made compatible with this package with the npz2h5features function.
The pipeline¶
In | Module | Out |
---|---|---|
data.item | task | data.abx |
data.abx, features file | distance | data.distance |
data.abx, data.distance | score | data.score |
data.abx, data.score | analyse | csv file |
See Files Format for a description of the files used as input and output.
analyze Module¶
- ABXpy.analyze.analyze(scorefile, taskfile, outfile)[source]¶
Analyse the results of a task
Parameters: task_file : string, hdf5 file
the file containing the triplets and pairs of the task
score_file : string, hdf5 file
the file containing the score of a task
analyse_file: string, csv file
the file that will contain the analysis
- ABXpy.analyze.collapse(scorefile, taskfile, fid)[source]¶
Collapse the results for all triplets sharing the same on, across and by labels.
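As a rough illustration of what collapsing means, the sketch below groups triplet scores by their (on, across, by) labels and averages each group. It uses plain Python and does not read any ABXpy file. The name `collapse_scores`, its inputs, and the choice of the mean as the summary are assumptions made for illustration, not a description of how ABXpy.analyze.collapse works internally.

```python
from collections import defaultdict


def collapse_scores(labels, scores):
    """Average the scores of triplets sharing the same (on, across, by) labels.

    Hypothetical helper, not part of ABXpy. Returns a dict mapping each
    label tuple to (mean score, number of triplets in the group).
    """
    groups = defaultdict(list)
    for label, score in zip(labels, scores):
        groups[label].append(score)
    return {
        label: (sum(values) / float(len(values)), len(values))
        for label, values in groups.items()
    }


# Toy data: two triplets share the same labels, one stands alone.
labels = [
    ("on_1", "ac_1", "by"),
    ("on_1", "ac_1", "by"),
    ("on_2", "ac_1", "by"),
]
collapsed = collapse_scores(labels, [1, -1, 1])
```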
score Module¶
This module is used for computing the score of a task (see task Module on how to create a task).
This module contains the actual computation of the score. It requires a task and a distance, and writes the output to a score file.
The main function takes a distance file and a task file as input and computes the score of the task on those distances. X closer to A is associated with a score of 1 and X closer to B with a score of -1.
The distances between pairs in the distance file must be ordered the same way as the pairs in the task file, and the triplet scores in the output file will be ordered the same way as the triplets in the task file.
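The scoring rule can be sketched in a few lines of plain Python. This is only an illustration, not the ABXpy implementation: the function name `triplet_scores` is hypothetical, and scoring ties as 0 is an assumption, since the text only specifies the 1 and -1 cases.

```python
def triplet_scores(dist_ax, dist_bx):
    """Score each triplet from its A-X and B-X distances.

    Hypothetical helper, not part of ABXpy. Both inputs must be ordered
    the same way, one entry per triplet.
    """
    scores = []
    for d_ax, d_bx in zip(dist_ax, dist_bx):
        if d_ax < d_bx:
            scores.append(1)   # X is closer to A
        elif d_ax > d_bx:
            scores.append(-1)  # X is closer to B
        else:
            scores.append(0)   # tie: assumption, not stated in the text
    return scores


scores = triplet_scores([0.2, 0.9, 0.5], [0.7, 0.1, 0.5])
```

Because the output keeps the input order, the i-th score corresponds to the i-th triplet, which mirrors the ordering requirement described above.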
Usage¶
From the command line:
python score.py data.abx data.distance data.score
In python:
import ABXpy.task
import ABXpy.score
# create a new task:
myTask = ABXpy.task.Task('data.item', 'on_feature', 'across_feature', 'by_feature', filters=my_filters, regressors=my_regressors)
myTask.generate_triplets()
#initialise distance
#TODO shouldn't this be available from score
# calculate the scores:
ABXpy.score.score('data.abx', 'myDistance.???', 'data.score')
- ABXpy.score.score(task_file, distance_file, score_file=None, score_group='scores')[source]¶
Calculate the score of a task and put the results in a hdf5 file.
Parameters: task_file : string
The hdf5 file containing the task (with the triplets and pairs generated)
distance_file : string
The hdf5 file containing the distances between the pairs
score_file : string, optional
The hdf5 file that will contain the results
task Module¶
This module is used for creating a new task and preprocessing.
This module contains the functions to specify and initialise a new ABX task, compute and display the statistics, and generate the ABX triplets and pairs.
It can also be used from the command line. See task --help for the documentation.
Usage¶
From the command line:
python task.py my_data.item -o column1 -a column2 column3 -b column4 column5 -f "[attr == 0 for attr in column3_X]"
my_data.item is a special file containing an index of the database and a set of labels or attributes. See input format [#TODO insert hypertext]
In python:
import ABXpy.task
# create a new task and compute the statistics
myTask = ABXpy.task.Task('data.item', 'on_label', 'across_feature', 'by_label', filters=my_filters, regressors=my_regressors)
print(myTask.stats)  # display statistics
myTask.generate_triplets()  # generate an h5db file 'data.abx' containing all the triplets and pairs
Example¶
#TODO this example is for the front page or ABX module, to move
An example of an ABX triplet:
A | B | X |
---|---|---|
on_1 | on_2 | on_1 |
ac_1 | ac_1 | ac_2 |
by | by | by |
A and X share the same ‘on’ attribute; A and B share the same ‘across’ attribute; A, B and X share the same ‘by’ attribute.
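These constraints can be made concrete with a small, self-contained sketch that enumerates the valid triplets of a toy item list. The function `enumerate_triplets` and the dictionary keys are hypothetical, and each item carries a single ‘across’ and a single ‘by’ label for simplicity. It is a brute-force illustration and does not reflect how ABXpy.task.Task generates triplets internally.

```python
from itertools import permutations


def enumerate_triplets(items):
    """Return all (A, B, X) name triplets satisfying the ABX constraints.

    Hypothetical helper, not part of ABXpy.
    """
    triplets = []
    for a, b, x in permutations(items, 3):
        # A, B and X share the same 'by' attribute
        same_by = a["by"] == b["by"] == x["by"]
        # A and X share the same 'on' attribute, B has a different one
        on_ok = a["on"] == x["on"] and b["on"] != a["on"]
        # A and B share the same 'across' attribute, X has a different one
        across_ok = a["across"] == b["across"] and x["across"] != a["across"]
        if same_by and on_ok and across_ok:
            triplets.append((a["name"], b["name"], x["name"]))
    return triplets


items = [
    {"name": "i1", "on": "on_1", "across": "ac_1", "by": "by"},
    {"name": "i2", "on": "on_2", "across": "ac_1", "by": "by"},
    {"name": "i3", "on": "on_1", "across": "ac_2", "by": "by"},
    {"name": "i4", "on": "on_2", "across": "ac_2", "by": "by"},
]
triplets = enumerate_triplets(items)
```

The triplet ("i1", "i2", "i3") corresponds to the table: A and X are both on_1, A and B are both ac_1, and all three share the same ‘by’ value.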
- class ABXpy.task.Task(db_name, on, across=None, by=None, filters=None, regressors=None, verbose=0, verify=True, features=None)[source]¶
Bases: object
Define an ABX task for a given database.
Parameters: db_name : str
the filename of database on which the ABX task is applied.
on : str
the ‘on’ attribute of the ABX task. A and X share the same ‘on’ attribute and B has a different one.
across : list, optional
a list of strings containing the ‘across’ attributes of the ABX task. A and B share the same ‘across’ attributes and X has a different one.
by : list, optional
a list of strings containing the ‘by’ attributes of the ABX task. A,B and X share the same ‘by’ attributes.
filters : list, optional
a list of strings specifying a filter on A, B or X.
regressors : list, optional
a list of strings specifying a regressor on A, B or X.
verbose : int, optional
display additional information if set to a value greater than 0.
verify : bool, optional
verify the correctness of the database file; done by default.
features : str, optional
the features file. Add it to verify the consistency with the item file.
Attributes
stats : dict
Contains several statistics about the task. The main 3 attributes are:
- nb_blocks: the number of blocks of ABX triplets sharing the same ‘on’, ‘across’ and ‘by’ features.
- nb_triplets: the number of triplets considered.
- nb_by_levels: the number of blocks of ABX triplets sharing the same ‘by’ attribute.
- compute_statistics(approximate=False)[source]¶
Compute the statistics of the task
The number of ABX triplets is exact in most cases if approximate is set to False. The other statistics can only be approximate when there are A, B, X or ABX filters.
Parameters: approximate : bool
approximate the number of triplets
- generate_pairs(output=None)[source]¶
Generate the pairs associated with the triplet list
Note
This function is called by generate_triplets and should not be used independently
- generate_triplets(output=None, sample=None)[source]¶
Generate all possible triplets for the whole task and the associated pairs
Generate the triplets and the pairs for an ABXpy.task.Task and store them in an h5db file.
Parameters: output : filename, optional
The output file. If not specified, it will automatically create a new file with the same name as the input file.
sample : bool, optional
apply the function on a sample of the task
- on_across_triplets(by, on, across, on_across_block, on_across_by_values, with_regressors=True)[source]¶
Generate all possible triplets for a given by block.
Given an on_across_block of the database and the parameters of the task, this function will generate the complete set of triplets and the regressors.
Parameters: by : int
The block index
on, across : int
The task attributes
on_across_block : list
the block
on_across_by_values : dict
the actual values
with_regressors : bool, optional
By default, true
Returns: triplets : numpy.ndarray
the set of triplets generated
regressors : numpy.ndarray
the regressors generated
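To illustrate how the pairs produced by generate_pairs could relate to the triplets produced by generate_triplets, the sketch below collects the distinct pairs needed to score a list of triplets. It assumes that the relevant pairs are (A, X) and (B, X), which follows from the score comparing X to A and to B but is not confirmed by this page. The function `pairs_from_triplets` is hypothetical and works on plain tuples rather than on an h5db file.

```python
def pairs_from_triplets(triplets):
    """Collect the distinct (A, X) and (B, X) pairs used by the triplets.

    Hypothetical helper, not part of ABXpy. Each pair appears once even
    if several triplets need it, so each distance is computed only once.
    """
    pairs = set()
    for a, b, x in triplets:
        pairs.add((a, x))
        pairs.add((b, x))
    return sorted(pairs)


# Two triplets sharing the same B and X need only three distinct pairs.
pairs = pairs_from_triplets([("i1", "i2", "i3"), ("i4", "i2", "i3")])
```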