tramway.core package

tramway.core.analyses module

tramway.core.analyses.base module

class tramway.core.analyses.base.AnalysesView(analyses)

Bases: dict

keys() → a set-like object providing a view on D's keys
class tramway.core.analyses.base.InstancesView(analyses)

Bases: tramway.core.analyses.base.AnalysesView

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
items() → a set-like object providing a view on D's items
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

values() → an object providing a view on D's values
class tramway.core.analyses.base.CommentsView(analyses)

Bases: tramway.core.analyses.base.AnalysesView

items() → a set-like object providing a view on D's items
values() → an object providing a view on D's values
class tramway.core.analyses.base.Analyses(data=None, metadata=None)

Bases: object

Analysis tree - Generic container with labels, comments and other metadata to structure the analyses that apply to the same data.

An Analyses object is a node of a tree. In attribute data (or equivalently artefact) it contains the input data for the children analyses, and these children analyses can be accessed as subtrees using a dict-like interface.

Labels of the children analyses can be listed with property labels. A label is a key in the dict-like interface.

Comments associated to children analyses are also addressable with labels.

Metadata are attached to each node, including the top node.

Setting the data attribute unsets the other attributes.

Example:

## let `my_input_data` and `my_output_data` be dataframes:
#my_output_data = my_analysis(my_input_data)

## build the tree
tree = Analyses(my_input_data) # root node

tree.add(my_output_data, label='my analysis', comment='description of my analysis')
# or equivalently (order matters):
tree['my analysis'] = my_output_data
tree.comments['my analysis'] = 'description of my analysis'

## print
print(tree)
#<class 'pandas.core.frame.DataFrame'>
#        'my analysis' <class 'pandas.core.frame.DataFrame'>:    "description of my analysis"

assert tree.data is my_input_data
# note that `my_output_data` has been automatically wrapped into an `Analyses` object:
assert isinstance(tree['my analysis'], Analyses)
assert tree['my analysis'].data is my_output_data

print(tree.labels) # or print(tree.keys())
#dict_keys(['my analysis'])

print(tree.comments['my analysis'])
#description of my analysis
data/artefact

input data to the children instances.

Type:any
instances

analyses on the data; keys are natural integers or string labels.

Type:dict
comments

comments associated to the analyses; keys are a subset of the keys in instances.

Type:dict
metadata

additional metadata associated to the input data; keys are attributes and are not related to children instances.

Type:dict
add(analysis, label=None, comment=None, raw=False)

Add an analysis.

Adding an analysis at an existing label overwrites the existing analysis instance and deletes the associated comment if any.

Parameters:
  • analysis (any) – analysis instance.
  • label (any) – key for the analysis; calls autoindex() if undefined.
  • comment (str) – associated comment.
  • raw (bool) – if analysis is not an Analyses, it is wrapped into such a container object; set raw to True to prevent wrapping.
artefact
autoindex(pattern=None)

Determine the lowest available natural integer for use as key in instances and comments.

If pattern is an integer, autoindex returns the pattern unchanged.

Parameters:pattern (str) – label with a ‘*’ to be replaced by a natural integer.
Returns:index or label.
Return type:int or str
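
A short sketch of the expected behaviour, reusing tree from the example above (the 'run *' pattern is hypothetical):

tree.autoindex()          # 0: lowest unused integer label
tree.autoindex('run *')   # 'run 0': '*' replaced by that integer
tree.autoindex(2)         # 2: integer patterns are returned unchanged
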
comments
data
instances
keys()
labels
metadata
tramway.core.analyses.base.map_analyses(fun, analyses, label=False, comment=False, metadata=False, depth=False, allow_tuples=False)
tramway.core.analyses.base.extract_analysis(analyses, labels)

Extract an analysis from a hierarchy of analyses.

The elements of an Analyses instance can be other Analyses objects. As such, analyses are structured in a tree that exhibits as many logically-consecutive layers as there are processing steps.

Parameters:
  • analyses (tramway.core.analyses.base.Analyses) – hierarchy of analyses, with instances possibly containing other Analyses instances.
  • labels (int, str or sequence of int and str) – analyses label(s); the first label addresses the first layer of analyses instances, the second label addresses the second layer of analyses and so on.
Returns:

copy of the analyses along the path defined by labels.

Return type:

tramway.core.analyses.base.Analyses
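
A short sketch, reusing tree from the Analyses example above; the returned object is expected to be a pruned copy that keeps only the addressed path:

subtree = extract_analysis(tree, 'my analysis')
assert 'my analysis' in subtree.labels
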

tramway.core.analyses.base.label_paths(analyses, filter)

Find label paths for analyses matching a criterion.

Parameters:
  • analyses (tramway.core.analyses.base.Analyses) – hierarchy of analyses, with instances possibly containing other Analyses instances.
  • filter (type or callable) – criterion over analysis data.
Returns:

list of label paths to matching analyses.

Return type:

list of tuples

tramway.core.analyses.base.find_artefacts(analyses, filters, labels=None, quantifiers=None, fullnode=False, return_subtree=False)

Find related artefacts.

Filters are applied to find data elements (artefacts) along a single path specified by labels.

Parameters:
  • analyses (tramway.core.analyses.base.Analyses) – hierarchy of analyses.
  • filters (type or callable or tuple or list) – list of criteria, a criterion being a boolean function or a type.
  • labels (list) – label path.
  • quantifiers (str or tuple or list) – list of quantifiers, a quantifier for now being either ‘first’, ‘last’ or ‘all’; a quantifier should be defined for each filter; default is ‘last’ (admits value None).
  • return_subtree (bool) – return as extra output argument the analysis subtree corresponding to the deepest matching artefact.
Returns:

matching data elements/artefacts, and optionally analysis subtree.

Return type:

tuple

Examples:

cells, maps = find_artefacts(analyses, (CellStats, Maps))

maps, maps_subtree = find_artefacts(analyses, Maps, return_subtree=True)
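
The quantifiers argument can be sketched as follows (continuing the examples above; whether each filter yields a single artefact or a list depends on the quantifier, as an expectation rather than a guarantee):

# keep the default 'last' quantifier for CellStats, collect all Maps artefacts
cells, all_maps = find_artefacts(analyses, (CellStats, Maps), quantifiers=(None, 'all'))
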
tramway.core.analyses.base.coerce_labels_and_metadata(analyses)
tramway.core.analyses.base.coerce_labels(analyses)
tramway.core.analyses.base.format_analyses(analyses, prefix='\t', node=<class 'type'>, global_prefix='', format_standalone_root=None, metadata=False, annotations={})
tramway.core.analyses.base.append_leaf(analysis_tree, augmented_branch, overwrite=False)

Merge new analyses into an existing analysis tree.

Only leaves and missing branches are appended. Existing nodes with children nodes are left untouched.

Parameters:
tramway.core.analyses.base.valid_label(label)

tramway.core.analyses.lazy module

class tramway.core.analyses.lazy.InstancesView(analyses, peek=False)

Bases: tramway.core.analyses.base.InstancesView

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
peek
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

class tramway.core.analyses.lazy.Analyses(data=None, metadata=None)

Bases: tramway.core.analyses.base.Analyses

comments
data
instances
metadata
terminate()

Close the opened file if any and delete all the handles.

type
tramway.core.analyses.lazy.find_artefacts(analyses, filters, labels=None, quantifiers=None, lazy=False, return_subtree=False)

Find related artefacts.

Filters are applied to find data elements (artefacts) along a single path specified by labels.

Parameters:
  • analyses (tramway.core.analyses.base.Analyses) – hierarchy of analyses.
  • filters (type or callable or tuple or list) – list of criteria, a criterion being a boolean function or a type.
  • labels (list) – label path.
  • quantifiers (str or tuple or list) – list of quantifiers, a quantifier for now being either ‘first’, ‘last’ or ‘all’; a quantifier should be defined for each filter; default is ‘last’ (admits value None).
  • lazy (bool) – if applying a filter function to a rwa.lazy.LazyPeek, whether to pass the lazy or the evaluated form.
  • return_subtree (bool) – return as extra output argument the analysis subtree corresponding to the deepest matching artefact.
Returns:

matching data elements/artefacts, and optionally analysis subtree.

Return type:

tuple

Examples

cells, maps = find_artefacts(analyses, (CellStats, Maps))

maps, maps_subtree = find_artefacts(analyses, Maps, return_subtree=True)
tramway.core.analyses.lazy.label_paths(analyses, filter, lazy=False)

Find label paths for analyses matching a criterion.

Parameters:
  • analyses (tramway.core.analyses.base.Analyses) – hierarchy of analyses, with instances possibly containing other Analyses instances.
  • filter (type or callable) – criterion over analysis data.
  • lazy (bool) – if applying filter function to a rwa.lazy.LazyPeek, whether to pass the lazy or the evaluated form.
Returns:

list of label paths to matching analyses.

Return type:

list of tuples

tramway.core.analyses.auto module

class tramway.core.analyses.auto.Analyses(data=None, metadata=None, rwa_file=None, autosave=False)

Bases: tramway.core.analyses.auto.LazyAnalysesProxy, tramway.core.analyses.auto.AutosaveCapable

autosaving analyses.

Argument and attribute rwa_file designate the output file.

add(analysis, label=None, comment=None, raw=False)
analyses
flag_as_modified()
classmethod from_rwa_file(input_file, output_file=None, **kwargs)
handler
hooks
modified(recursive=False)
postprocess(analyses)
preprocess(analyses)
reset_modification_flag(recursive=False)
rwa_file
save(out_of_context=False)

should call self.reset_modification_flag(True)

save_options
statefree()
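
A short usage sketch of the autosaving container; the input data, result and file name are hypothetical:

from tramway.core.analyses.auto import Analyses

tree = Analyses(my_spt_data, rwa_file='example.rwa', autosave=True)
tree.add(my_result, label='my analysis', comment='description of my analysis')
tree.save()   # writes to `rwa_file` and should reset the modification flags
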
class tramway.core.analyses.auto.AutosaveCapable(autosave=True)

Bases: object

Abstract class.

Child classes, if slotted, should define:

  • _default_autosave_policy (bool or str),
  • _active_autosave_policy (bool or str).

When of str type, an autosave policy can take the following values:

  • ‘on completion’ saves once on normal completion of the entire process (default)
  • ‘on termination’ saves once on termination of the entire process, whether it is successful or not
  • ‘on every step’ saves on every successful step

If _active_autosave_policy is True, then _default_autosave_policy applies.

active
autosave
autosave_policy
autosaving(policy=None)
force_save
save()

should call self.reset_modification_flag(True)

save_on_completion
save_on_every_step

tramway.core.chain module

class tramway.core.chain.Matrix(size, shape, dtype, order)

Bases: tuple

dtype

Alias for field number 2

order

Alias for field number 3

shape

Alias for field number 1

size

Alias for field number 0

class tramway.core.chain.ArrayChain(*members, **kwargs)

Bases: object

at(a, member)
get(a, member)
members
order
set(a, member, m)
shape
size
class tramway.core.chain.ChainArray(*members, **kwargs)

Bases: tramway.core.chain.ArrayChain

combined
update(x)

tramway.core.exceptions module

exception tramway.core.exceptions.EfficiencyWarning

Bases: RuntimeWarning

exception tramway.core.exceptions.FileNotFoundWarning

Bases: tramway.core.exceptions.IOWarning

exception tramway.core.exceptions.IOWarning

Bases: Warning

exception tramway.core.exceptions.MisplacedAttributeWarning

Bases: UserWarning

exception tramway.core.exceptions.MissingSupportWarning

Bases: UserWarning

exception tramway.core.exceptions.MultipleArgumentError

Bases: ValueError

exception tramway.core.exceptions.NaNWarning

Bases: RuntimeWarning

exception tramway.core.exceptions.RWAFileException(filepath=None, exc=None)

Bases: OSError

exception tramway.core.exceptions.SideEffectWarning

Bases: UserWarning

tramway.core.hdf5 package

This module implements the Storable class for TRamWAy datatypes.

tramway.core.hdf5.load_rwa(path, verbose=None, lazy=False, force_load_spt_data=None)

Load a .rwa file.

Note about laziness: the analysis tree uses an active handle to the opened file. As a consequence, the file should be read only once. It is safe to load, modify, save and then load again, but the first loaded data should be terminated before loading the data again.

Parameters:
  • path (str) – path to .rwa file
  • verbose (bool or int) – verbosity level
  • lazy (bool) – reads the file lazily
  • force_load_spt_data (bool) – new in 0.5; compatibility flag for pre-0.5 code; None currently defaults to True, but will default to False in the future
Returns:

analysis tree; if lazy is True, return type is tramway.core.analyses.lazy.Analyses instead

Return type:

tramway.core.analyses.base.Analyses

tramway.core.hdf5.save_rwa(path, analyses, verbose=False, force=None, compress=True, append=False, overwrite=None)

Save an analysis tree into a .rwa file.

Parameters:
  • path (str) – path to .rwa file
  • analyses (tramway.core.analyses.base.Analyses) – analysis tree
  • verbose (bool or int) – verbose mode
  • force/overwrite (bool) – do not ask whether to overwrite an existing file or not
  • compress (bool) – delete the lazy attributes that can be computed again automatically
  • append (bool) – do not overwrite; reload the file instead and append the analyses as a subtree
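
A short load/save sketch; the order of operations follows the laziness note above ('example.rwa' is a hypothetical file name):

from tramway.core.hdf5 import load_rwa, save_rwa

tree = load_rwa('example.rwa')                    # eager loading
print(tree)                                       # formatted analysis tree

lazy_tree = load_rwa('example.rwa', lazy=True)    # keeps an active handle on the file
# ... access only what is needed ...
lazy_tree.terminate()                             # release the handle before loading again

save_rwa('example_copy.rwa', tree, force=True)    # overwrite without asking
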
tramway.core.hdf5.poke_maps(store, objname, self, container, visited=None, _stack=None, legacy=False)
tramway.core.hdf5.peek_maps(store, container, _stack=None)

tramway.core.hdf5.compat module

tramway.core.hdf5.compat.translate_types(translation_table)

Translate types for rwa files.

tramway.core.hdf5.store module

tramway.core.hdf5.store.load_rwa(path, verbose=None, lazy=False, force_load_spt_data=None)

Load a .rwa file.

Note about laziness: the analysis tree uses an active handle to the opened file. As a consequence, the file should be read only once. It is safe to load, modify, save and then load again, but the first loaded data should be terminated before loading the data again.

Parameters:
  • path (str) – path to .rwa file
  • verbose (bool or int) – verbosity level
  • lazy (bool) – reads the file lazily
  • force_load_spt_data (bool) – new in 0.5; compatibility flag for pre-0.5 code; None currently defaults to True, but will default to False in the future
Returns:

analysis tree; if lazy is True, return type is tramway.core.analyses.lazy.Analyses instead

Return type:

tramway.core.analyses.base.Analyses

tramway.core.hdf5.store.save_rwa(path, analyses, verbose=False, force=None, compress=True, append=False, overwrite=None)

Save an analysis tree into a .rwa file.

Parameters:
  • path (str) – path to .rwa file
  • analyses (tramway.core.analyses.base.Analyses) – analysis tree
  • verbose (bool or int) – verbose mode
  • force/overwrite (bool) – do not ask whether to overwrite an existing file or not
  • compress (bool) – delete the lazy attributes that can be computed again automatically
  • append (bool) – do not overwrite; reload the file instead and append the analyses as a subtree

tramway.core.lazy module

exception tramway.core.lazy.PermissionError(property_name, related_attribute)

Bases: AttributeError

tramway.core.lazy.ro_property_assert(obj, supplied_value, related_attribute=None, property_name=None, depth=0)
class tramway.core.lazy.Lazy

Bases: object

Lazy store.

Lazily computes and stores attributes through properties, so that the stored attributes can be (explicitly) deleted anytime to save memory.

The __lazy__ static attribute is a list of the properties that implement such a mechanism.

By default, each lazy property name manages a private _name attribute. This naming convention can be overridden by inheriting from Lazy and overloading the __tolazy__() and __fromlazy__() methods.

An unset lazy attribute/property always has value None.

A getter will typically look like this:

@property
def name(self):
    if self._name is None:
        self._name = ...  # compute the value here
    return self.__lazyreturn__(self._name)

A fully functional setter will typically look like this:

@name.setter
def name(self, value):
    self.__lazysetter__(value)

A read-only lazy property will usually look like this:

@name.setter
def name(self, value):
    self.__lazyassert__(value)

__lazyassert__ can unset _name (set it to None) but any other value is treated as illegal. __lazyassert__ compares value with self.name and raises a warning if the values are equal, or throws an exception otherwise.
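
A minimal sketch of a Lazy subclass with one lazy property, combining the patterns above; the area property, its computation, and the assumption that the base __lazy__ attribute is a list are illustrative only:

from tramway.core.lazy import Lazy

class Shape(Lazy):
    __lazy__ = Lazy.__lazy__ + ['area']

    def __init__(self, width, height):
        Lazy.__init__(self)
        self.width, self.height = width, height
        self._area = None                   # unset lazy attribute

    @property
    def area(self):
        if self._area is None:
            self._area = self.width * self.height   # computed on first access
        return self.__lazyreturn__(self._area)

    @area.setter
    def area(self, value):
        self.__lazysetter__(value)

s = Shape(2., 3.)
print(s.area)   # computed and stored in `_area`
s.unload()      # clears the lazy attributes; recomputed on next access
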

unload(visited=None)

Recursively clear the lazy attributes.

Beware: only direct Lazy object attributes are unloaded,
not Lazy objects stored in non-lazy attributes!

Deprecated

tramway.core.lazy.lightcopy(x)

Return a copy and call unload if available.

Parameters:x (any) – object to be copied and unloaded.
Returns:copy of x.
Return type:any

Deprecated

tramway.core.namedcolumns module

tramway.core.namedcolumns.isstructured(x)

Check for named columns.

The adjective structured comes from NumPy structured array.

Parameters:x (any) – any datatype
Returns:True if input argument x has named columns.
Return type:bool
tramway.core.namedcolumns.columns(x)

Get column names.

Parameters:x (any) – datatype that satisfies isstructured().
Returns:column iterator.
Return type:iterable
Raises:ValueError – if no named columns are found in x.
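
A short sketch of isstructured() and columns() on common datatypes:

import numpy as np
import pandas as pd
from tramway.core.namedcolumns import isstructured, columns

df = pd.DataFrame({'x': [0., 1.], 'y': [2., 3.]})
arr = np.zeros(2, dtype=[('x', float), ('y', float)])   # NumPy structured array

print(isstructured(df), isstructured(arr))   # expected: True True
print(isstructured(np.zeros((2, 2))))        # expected: False (no named columns)
print(list(columns(df)))                     # expected: ['x', 'y']
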
tramway.core.namedcolumns.splitcoord(varnames, asstr=True)
tramway.core.namedcolumns.expandcoord(varname, dim)

tramway.core.parallel package

class tramway.core.parallel.StarConn(input_queue=None, output_queues=None)

Bases: object

close()
empty()
full()
get(block=True, timeout=None)
get_nowait()
input
join_thread()
joinable
output
put(obj, block=True, timeout=None)
put_nowait(obj)
qsize()
task_done()
variant
class tramway.core.parallel.StarQueue(n, variant=<bound method BaseContext.Queue of <multiprocessing.context.DefaultContext object>>, **kwargs)

Bases: object

A star queue is a multidirectional queue such that each message must be consumed by every process but the sender. This is useful to send asynchronous updates between processes in a distributed setting.

A StarQueue is instantiated first, and then StarConn objects are dealt to child processes. The StarConn objects are the actual queues.

deal()
deck
class tramway.core.parallel.ProtoWorkspace(args=())

Bases: object

identify_extensions(args)
pop_extension_updates()
push_extension_updates(updates)
resources(step)
update(step)
class tramway.core.parallel.Workspace(data_array, *args)

Bases: tramway.core.parallel.ProtoWorkspace

Parameter singleton.

data_array

working copy of the parameter vector.

Type:array-like
data_array
class tramway.core.parallel.JobStep(_id, workspace=None)

Bases: object

Job step data.

A job step object contains all the necessary input data for a job step to be performed as well as the output data resulting from the step completion.

A job step object merely contains a reference to a shared workspace.

The resource_id attribute refers to a series of job steps that operate on the same subset of resource items.

Multiple steps can operate simultaneously in the same workspace in a distributed fashion provided that they do not compete for the same resources.

resources is an index array that designates the items of shared data to be accessed. This attribute is used by Scheduler to lock the required items of data, which determines which steps can be run simultaneously.

get_workspace()
resource_id
resources
set_workspace(ws)
unset_workspace()
workspace_set
class tramway.core.parallel.UpdateVehicle

Bases: object

Not instantiable! Introduced for __slots__-enabled multiple inheritance.

Example usage, in the case where class B implements (abc.) VehicleJobStep and class A can only implement (abc.) JobStep and cannot inherit from VehicleJobStep:

from tramway.core.parallel import UpdateVehicle, VehicleJobStep
import tramway.core.parallel.abc as abc

class A:
    __slots__ = ('a',)
    def __init__(self, a):
        self.a = a
abc.JobStep.register(A)

class B(A, UpdateVehicle):
    __slots__ = ('b',) + VehicleJobStep.__slots__
    def __init__(self, a, b):
        A.__init__(self, a)
        UpdateVehicle.__init__(self)
        self.b = b
abc.VehicleJobStep.register(B)

VehicleJobStep brings the slots, UpdateVehicle brings the implementation (methods) and abc.VehicleJobStep the typing required by Workspace and Worker to handle B as a VehicleJobStep.

pop_updates()
push_updates(updates)
class tramway.core.parallel.VehicleJobStep(_id, workspace=None)

Bases: tramway.core.parallel.JobStep, tramway.core.parallel.UpdateVehicle

class tramway.core.parallel.Worker(_id, workspace, task_queue, return_queue, update_queue, name=None, args=(), kwargs={}, daemon=None, **_kwargs)

Bases: multiprocessing.context.Process

Worker that runs job steps.

The target() method may be implemented following the pattern below:

class MyWorker(Worker):
    def target(self, *args, **kwargs):
        while True:
            k, task = self.get_task()
            status = dict()
            try:
                pass  # modify `task` and `status` here
            finally:
                self.push_update(task, status)

The optional positional and keyword arguments come from the args and kwargs arguments to Scheduler.__init__(), plus the extra keyword arguments to the latter constructor.

get_task()

Listen to the scheduler and get a job step to be run.

The job step is loaded with the worker-local copy of the synchronized workspace.

Returns:step/iteration number, job step object.
Return type:int, tramway.core.parallel.JobStep
pull_updates()
push_update(update, status=None)

Send a completed job step back to the scheduler and to the other workers.

Parameters:
run()

Method to be run in sub-process; can be overridden in sub-class

class tramway.core.parallel.Scheduler(workspace, tasks, worker_count=None, iter_max=None, name=None, args=(), kwargs={}, daemon=None, max_runtime=None, task_timeout=None, **_kwargs)

Bases: object

Scheduler that distributes job steps over a shared workspace.

Workers are spawned and assigned job steps. Each worker maintains a copy of the common workspace that is synchronized by Worker.push_update() calls.

The stop() method should be overloaded so that the distributed computation may complete on termination criteria.
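
A hedged sketch of a Scheduler subclass; the convergence flag in status is hypothetical, and the choice of Worker class is not covered here:

class MyScheduler(Scheduler):
    def stop(self, k, i, status):
        # terminate the distributed computation once a step reports convergence
        return bool(status.get('converged', False))

scheduler = MyScheduler(workspace, job_steps, worker_count=4)
success = scheduler.run()   # True on normal completion, False on interruption
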

available_slots

Number of available workers.

Type:int
draw(k)
fill_slots(k, postponed)

Send as many job steps as there are available workers.

Parameters:
  • k (int) – step/iteration number.
  • postponed (dict) – postponed job steps.
Returns:

new step/iteration number.

Return type:

int

get_processed_step()

Retrieve a processed job step and check whether stopping criteria are met.

Calls the stop() method.

Returns False if a stopping criterion has been met, True otherwise.

init_resource_lock()
iter_max_reached()
lock(step)
locked(step)
next_task()

no-mp mode only

pseudo_stop(status)
pseudo_worker
run()

Start the workers, send and get job steps back and check for stop criteria.

Returns True on normal completion, False on interruption (SystemExit, KeyboardInterrupt).

send_task(k, step)

Send a job step to be assigned to a worker as soon as possible.

Parameters:
stop(k, i, status)

Default implementation returns False.

Parameters:
  • k (int) – step/iteration number.
  • i (int) – step id.
  • status (dict) – status data returned by a job step.
Returns:

True if a stopping criterion has been met, False otherwise.

Return type:

bool

timeout
unlock(step)
worker
worker_count
workers_alive()
class tramway.core.parallel.EpochScheduler(workspace, tasks, epoch_length=None, soft_epochs=False, worker_count=None, iter_max=None, name=None, args=(), kwargs={}, daemon=None, **_kwargs)

Bases: tramway.core.parallel.Scheduler

draw(k)
start_new_epoch(task_order)

Must modify the task_order array in place.

tramway.core.parallel.abc module

class tramway.core.parallel.abc.JobStep

Bases: object

Job step data.

A job step object contains all the necessary input data for a job step to be performed as well as the output data resulting from the step completion.

A job step object merely contains a reference to a shared workspace.

The resource_id attribute refers to a series of job steps that operate on the same subset of resource items.

get_workspace()
resource_id

Resource-related job ID.

A job step is one of many that operate on a same subset of resources. resource_id uniquely designates this specific subset of resources.

Type:int
resources

Indices or keys of the required items of resource in the workspace.

May be implemented as follows:

return self.get_workspace().resources(self)
Type:sequence
set_workspace(ws)
unset_workspace()
workspace_set

Is the workspace set?

Type:bool
class tramway.core.parallel.abc.Workspace

Bases: object

Parameter singleton.

A computation typically instantiates a unique workspace that is embedded in and shared between multiple JobStep instances.

It embarks resources that a job step may access.

resources(step)

May be implemented as follows:

return step.resources

See also: JobStep.resources.

update(step)

Update the workspace with a completed job step.

Parameters:step (tramway.core.parallel.JobStep) – completed job step.

May be implemented as follows:

step.set_workspace(self)
class tramway.core.parallel.abc.WorkspaceExtension

Bases: object

See also: ExtendedWorkspace.

pop_workspace_update()
push_workspace_update(upload)
class tramway.core.parallel.abc.ExtendedWorkspace

Bases: tramway.core.parallel.abc.Workspace

Workspace that listens to external objects which implement WorkspaceExtension provided that the job step data implements VehicleJobStep.

pop_extension_updates(updates)
push_extension_updates(updates)
class tramway.core.parallel.abc.VehicleJobStep

Bases: object

Job step that keeps a reference to workspace-like objects that are not part of the main workspace.

Changes to these external workspace-like objects are encoded together with the job step data so that it can be passed to and replayed in the other workers’ workspaces.

pop_updates()
push_updates(updates)

tramway.core.plugin module

tramway.core.plugin.list_plugins(dirname, package, lookup={}, force=False, require=None, verbose=False)
tramway.core.plugin.add_arguments(parser, arguments, name=None)
tramway.core.plugin.short_options(arguments)
class tramway.core.plugin.Plugins(dirname, package, lookup={}, force=False, require=None, verbose=False)

Bases: object

dirname
force
get(mod, default)
items()
keys()
lookup
modules
package
pop(mod, default)
post_load
require
update(plugins)
values()
verbose

tramway.core.scaler module

class tramway.core.scaler.Scaler(scale=None, euclidean=None)

Bases: object

Scaler scales data points, point differences (vectors) or distances.

It initializes itself with the first provided sample (in scale_point()), and then applies the same transformation to the next samples.

A default Scaler() instance does not scale. However, initialization still takes place so that scaled() properly works.

It manages a constraint in the calculation of the scaling parameters, forcing a common factor over a subset of dimensions. Attribute euclidean controls the selection of this subset. Distances are scaled and unscaled only in this subspace, if it is defined.

Beware that data are scaled in place whenever possible; however, the optional scaledonly argument, when available, never operates in place.

init

True as long as Scaler has not been initialized.

Type:bool
center

Vector that is subtracted from each row of the data matrix to be scaled.

Type:array or pandas.Series
factor

Vector by which each row of the data matrix to be scaled is divided. Applies after center.

Type:array or pandas.Series
columns

Sequence of column names along which scaling applies. This applies only to structured data. columns is determined even if Scaler is set to do nothing, so that scaled() can still apply. columns can be manually set after the first call to scale_point() if data are not structured (do not have named columns).

Type:list or pandas.Index
function

A function that takes a data matrix as input and returns center and factor. function is called once during the first call to scale_point().

Type:callable
euclidean

Sequence of names or indices of the columns to be scaled by a common factor.

Type:list
center
columns
euclidean
factor
function
init
ready

Returns True if scaler is initialized.

scale_distance(dist, inplace=True)
scale_length(dist, inplace=True)
scale_point(points, inplace=True, scaledonly=False, asarray=False)

Scale data.

When this method is called for the first time, the Scaler instance initializes itself for further call of any of its methods.

Parameters:
  • points (array-like) – Data matrix to be scaled. When scale_point() is called for the first time, points can be structured or not, without the unnecessary columns, if any. At further calls of any (un-)scaling method, the input data should be in the same format with at least the same column names, and may feature extra columns.
  • inplace (bool) – Per default, scaling is performed in-place. With inplace=False, points are first copied.
  • scaledonly (bool) – If True, undeclared columns are stripped away out of the returned data.
  • asarray (bool) – If True, the returned data is formatted as a numpy.array.
Returns:

With the default optional input arguments, the returned variable is a reference to points; otherwise, it is a distinct object.

Return type:

array-like

scale_size(size, dim=None, inplace=True, _unscale=False)

Scale/unscale lengths, surface areas, volumes and other scalar sizes.

The calling Scaler instance must have been initialized.

Parameters:
  • size (array-like) – Values to be scaled, per element.
  • dim (int) – Number of characteristic dimensions, with 0 referring to all the euclidean dimensions (e.g. lengths: 1, areas: 2, volumes: 0).
  • inplace (bool) – Per default, scaling is performed in-place. With inplace=False, size is first copied.
  • _unscale (bool) – If True, unscales instead.
Returns:

scaled values.

Return type:

array-like

scale_surface_area(area, inplace=True)
scale_vector(vect, inplace=True, scaledonly=False, asarray=False)

Scale vectors.

The calling Scaler instance must have been initialized.

Parameters:
  • vect (array-like) – Data matrix to be scaled.
  • inplace (bool) – Per default, scaling is performed in-place. With inplace=False, vect is first copied.
  • scaledonly (bool) – If True, undeclared columns are stripped away out of the returned data.
  • asarray (bool) – If True, the returned data is formatted as a numpy.array.
Returns:

scaled data matrix.

Return type:

array-like

scale_volume(vol, inplace=True)
scaled(points, asarray=False)

Discard columns that are not recognized by the initialized scaler.

Applies to points and vectors, not distances, surface areas or volumes.

unscale_distance(dist, inplace=True)
unscale_length(dist, inplace=True)
unscale_point(points, inplace=True)

Scale data back to original domain.

The calling Scaler instance must have been initialized.

Parameters:
  • points (array-like) – Scaled data matrix to be unscaled.
  • inplace (bool) – Per default, scaling is performed in-place. With inplace=False, points are first copied.
Returns:

unscaled data matrix.

Return type:

array-like

unscale_surface_area(area, inplace=True)
unscale_vector(vect, inplace=True)

Scale vectors back to original range.

The calling Scaler instance must have been initialized.

Parameters:
  • vect (array-like) – Scaled data matrix to be unscaled.
  • inplace (bool) – Per default, scaling is performed in-place. With inplace=False, vect is first copied.
Returns:

unscaled data matrix.

Return type:

array-like

unscale_volume(vol, inplace=True)
tramway.core.scaler.whiten()

Returns a Scaler that scales data x following: (x - mean(x)) / std(x).

tramway.core.scaler.unitrange()

Returns a Scaler that scales data x following: (x - min(x)) / (max(x) - min(x)).
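
A short usage sketch of whiten(); the coordinates are hypothetical:

import pandas as pd
from tramway.core.scaler import whiten

points = pd.DataFrame({'x': [0., 1., 2.], 'y': [0., 2., 4.]})

scaler = whiten()                                     # (x - mean(x)) / std(x)
scaled = scaler.scale_point(points, inplace=False)    # first call initializes the scaler
restored = scaler.unscale_point(scaled, inplace=False)
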

tramway.core.xyt module

tramway.core.xyt.translocations(df, sort=False)

Each trajectory should be represented by consecutive rows sorted by time.

Returns displacements without the associated position information.

tramway.core.xyt.iter_trajectories(trajectories, trajnum_colname='n', asslice=False, asarray=False, order=None)

Yields the different trajectories in turn.

If asslice=True, the indices corresponding to the trajectory are returned instead (or as first output argument if asarray=True), as a slice (first index, last index + 1).

order can be ‘start’ to ensure that trajectories are yielded by ascending start time.

tramway.core.xyt.iter_full_trajectories(cropped_trajs, all_trajs, match_cols=['x', 'y', 't'], unique=True)

In the case cropped_trajs results from cropping all_trajs, yields the trajectories in all_trajs that are in cropped_trajs.

This function is helpful for retrieving the original trajectories, with the excluded points included back. Indeed, crop removes the out-of-bound locations, splits the affected trajectories and re-indexes them, so that they are contiguous in time.

There is no need to call translocations_to_trajectories on cropped_trajs beforehand.

If all_trajs actually represents series of translocations, iter_full_trajectories yields full series of translocations.

Setting argument unique to False makes iter_full_trajectories yield the trajectories that correspond to those yielded by iter_trajectories, so that both outputs can be zipped. The following code example iterates over regions of interest using the RWAnalyzer object:

from tramway.core.xyt import *
from tramway.analyzer import *

a = RWAnalyzer()

# ... [define the SPT data and ROI]

for f in a.spt_data:
    all_trajectories = f.dataframe
    for r in f.roi.as_support_regions():
        local_translocations = r.crop()
        local_trajectories = translocations_to_trajectories(local_translocations)
        for cropped_trajectory, full_trajectory in zip(
                iter_trajectories(local_trajectories),
                iter_full_trajectories(local_translocations, all_trajectories, unique=False),
            ):
            # do something with `cropped_trajectory` and corresponding `full_trajectory`
            pass
tramway.core.xyt.iter_frames(points, asslice=False, as_trajectory_slices=False, dt=None, skip_empty_frames=True)

Yields series of row indices, each series corresponding to a different frame.

tramway.core.xyt.trajectories_to_translocations(points, exclude_columns=['n'])

Appends delta columns (‘dx’, ‘dy’, ‘dt’, etc) and removes the last location of each trajectory.

See also translocations_to_trajectories.

tramway.core.xyt.translocations_to_trajectories(points)

Reintroduces the last location of each trajectory, and discards the delta columns.

See also trajectories_to_translocations.
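
A minimal round-trip sketch on a hypothetical trajectory DataFrame with columns 'n', 'x', 'y' and 't':

import pandas as pd
from tramway.core.xyt import (trajectories_to_translocations,
                              translocations_to_trajectories)

trajs = pd.DataFrame({
    'n': [0, 0, 0, 1, 1],
    'x': [0.0, 0.1, 0.2, 1.0, 1.1],
    'y': [0.0, 0.0, 0.1, 1.0, 1.2],
    't': [0.0, 0.1, 0.2, 0.0, 0.1],
})

translocs = trajectories_to_translocations(trajs)       # adds 'dx', 'dy', 'dt'; drops each last location
recovered = translocations_to_trajectories(translocs)   # reintroduces last locations; drops deltas
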

tramway.core.xyt.load_xyt(path, columns=None, concat=True, return_paths=False, verbose=False, reset_origin=False, header=None, **kwargs)

Load trajectory files.

Files are loaded with read_csv(); all the files should have the same number of columns, and either none or all of them should have a single-line header.

Default column names are ‘n’, ‘x’, ‘y’ and ‘t’.

Parameters:
  • path (str or list of str) – path to trajectory file or directory.
  • columns (list of str) – column names.
  • concat (bool) – if multiple files are read, return a single DataFrame.
  • return_paths (bool) – paths to files are returned as second output argument.
  • verbose (bool) – print extra messages.
  • reset_origin (bool or sequence) – the lowest coordinate is translated to 0. Applies to the time and space columns. Default column names are ‘x’, ‘y’, ‘z’ and ‘t’. A sequence overrides the default.
  • header (bool) – if defined, a single-line header is expected in the file(s); if False, ignore the header; if True, overwrite the columns argument with names from the header; if undefined, check whether a header is present and, if so, act as True.
Returns:

trajectories as one or multiple DataFrames;

if tuple (with return_paths), the trajectories are first, the list of filepaths second.

Return type:

pandas.DataFrame or list or tuple

Extra keyword arguments are passed to read_csv().
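
A short usage sketch; the file and directory paths are hypothetical:

from tramway.core.xyt import load_xyt

trajectories = load_xyt('my_trajectories.txt', columns=['n', 'x', 'y', 't'])

# load a directory of files, keeping them separate and returning their paths
dfs, paths = load_xyt('my_data_dir', concat=False, return_paths=True)
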

tramway.core.xyt.load_mat(path, columns=None, varname='plist', dt=None, coord_scale=None, pixel_size=None)

Load SPT data from MatLab V7 file.

The two pieces of code below are almost equivalent:

xyt      = load_mat(my_file, columns=list('txy'), dt=frame_interval)
xyt      = load_mat(my_file, columns=['frame_index', 'x', 'y'])
xyt['t'] = xyt['frame_index'] * frame_interval

The only difference resides in the 'frame_index' column that is missing in the first dataframe.

Parameters:
  • path (str) – file path.
  • columns (sequence of str) – data column names; default is [‘t’, ‘x’, ‘y’].
  • varname (str) – record name.
  • dt (float) – frame interval in seconds; if defined together with a ‘t’ column in the data, the original ‘t’ values are considered to be frame indices and are multiplied by dt to transform them into times.
  • coord_scale (float) – conversion factor for spatial coordinates.
  • pixel_size (float) – deprecated; superseded by coord_scale.
Returns:

SPT data.

Return type:

pandas.DataFrame

tramway.core.xyt.crop(points, box, by=None, add_deltas=True, keep_nans=False, no_deltas=False, keep_nan=None, preserve_index=False)

Remove locations outside a bounding box.

When a location is discarded, the corresponding trajectory is split into two distinct trajectories.

Important: the locations of any given trajectory should be contiguous and ordered.

Parameters:
  • points (pandas.DataFrame) – locations with trajectory indices in column ‘n’, times in column ‘t’ and coordinates in the other columns; delta columns are ignored
  • box (array-like) – origin and size of the space bounding box
  • by (str) – for translocations only; ‘start’ or ‘origin’: crop by translocation origin; keep the associated destinations; ‘stop’ or ‘destination’: crop by translocation destinations; keep the associated origins; trajectories with a single non-terminal point outside the bounding box are not split
  • add_deltas (bool) – add ‘dx’, ‘dy’, …, ‘dt’ columns if they are not already present; deltas are associated to the translocation origins
  • keep_nans/keep_nan (bool) – adding deltas generates NaN; keep them
  • no_deltas (bool) – do not consider any column as deltas
  • preserve_index (bool) – do not split the trajectories with out-of-bound locations, do not re-index the trajectories nor the rows
Returns:

filtered (trans-)locations

Return type:

pandas.DataFrame
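
A short sketch; the bounding-box layout (origin followed by size, one entry per space dimension) is inferred from the description above:

import numpy as np
from tramway.core.xyt import crop

box = np.array([0., 0., 10., 10.])   # origin (x0, y0) followed by size (width, height); inferred layout
cropped = crop(trajectories, box)    # out-of-bound locations removed; trajectories split and re-indexed
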

tramway.core.xyt.discard_static_trajectories(trajectories, min_msd=None, trajnum_colname='n', full_trajectory=False, verbose=False, localization_error=None)
Parameters:
  • trajectories (DataFrame) – trajectory or translocation data with columns 'n' (trajectory number), spatial coordinates 'x' and 'y' (and optionally 'z'), and time 't'; delta columns, if available (translocations), are used instead for calculating the displacement length.
  • min_msd (float) – minimum mean-square-displacement (usually set to the localization error).
  • trajnum_colname (str) – column name for the trajectory number.
  • full_trajectory (float or bool) – if True, the trajectories with static translocations are entirely discarded; if False, only the static translocations are discarded, and the corresponding trajectories are discarded only if they end up being single points; if a float value, trajectories with full_trajectory x 100% static translocations or more are discarded.
  • localization_error (float) – alias for min_msd; for backward compatibility.
Returns:

filtered trajectory data with a new row index.

Return type:

DataFrame
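
A short sketch; the minimum mean square displacement value is hypothetical:

from tramway.core.xyt import discard_static_trajectories

filtered = discard_static_trajectories(trajectories, min_msd=1e-4)
# discard a whole trajectory as soon as half of its translocations are static:
filtered = discard_static_trajectories(trajectories, min_msd=1e-4, full_trajectory=.5)
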

tramway.core.xyt.reindex_trajectories(trajectories, trajnum_colname='n', dt=None)

Splits the trajectories with missing time steps and assigns different indices to the different segments.

Works with trajectories and translocations.