cas.ingest package
Submodules
cas.ingest.config_validator module
- cas.ingest.config_validator.validate(json_object)[source]
Validates the given JSON configuration object using the cell type annotation schema.
- Parameters:
json_object (object) – configuration object
- Returns:
True if the object is valid, False otherwise.
- Return type:
bool
- cas.ingest.config_validator.validate_file(file_path)[source]
Reads the configuration object from the given path and validates it.
- Parameters:
file_path (str) – path to the JSON file
- Returns:
True if the object is valid, False otherwise.
- Return type:
bool
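The read-then-validate flow described for validate_file can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the names validate_sketch and validate_file_sketch are hypothetical, and the check for a "fields" list is an assumed stand-in for the real cell type annotation schema, which is not reproduced here.

```python
import json


def validate_sketch(json_object):
    """Hypothetical stand-in for the schema check.

    Assumption: a valid configuration is a dict holding a list under
    the key "fields". The real schema is richer than this.
    """
    return isinstance(json_object, dict) and isinstance(
        json_object.get("fields"), list
    )


def validate_file_sketch(file_path):
    """Read a JSON document from file_path and validate it."""
    with open(file_path, encoding="utf-8") as handle:
        return validate_sketch(json.load(handle))
```

Both functions return a plain bool, mirroring the documented return type, so a caller can branch on the result without catching a validation exception.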
cas.ingest.ingest_user_table module
- cas.ingest.ingest_user_table.ingest_data(data_file, config_file, out_file, format='json', print_undefined=False, generate_accession_ids=False)[source]
Ingests the given data into the standard cell annotation schema data structure using the given configuration.
- Parameters:
data_file (str) – Unformatted user data in tsv/csv format.
config_file (str) – Configuration file path.
out_file (str) – Output file path.
format (str) – Data export format. Supported formats are ‘json’ and ‘tsv’.
print_undefined (bool) – Prints null values to the output JSON if true; omits undefined values from the JSON output if false. False by default. Only effective in JSON serialization.
generate_accession_ids (bool) – Determines whether to incrementally generate accession_ids for all annotations that don’t have an id.
- Returns:
Output data as a dict.
- Return type:
dict
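The effect of print_undefined on JSON output can be illustrated with a small standalone sketch. The function serialize_sketch and its helper are hypothetical and only mimic the documented behaviour (nulls written when the flag is true, undefined values omitted when it is false). They are not taken from the package.

```python
import json


def _drop_undefined(value):
    """Recursively remove dict entries whose value is None."""
    if isinstance(value, dict):
        return {
            key: _drop_undefined(item)
            for key, item in value.items()
            if item is not None
        }
    if isinstance(value, list):
        # List items are kept as they are; only dict entries are dropped.
        return [_drop_undefined(item) for item in value]
    return value


def serialize_sketch(data, print_undefined=False):
    """Serialize data to JSON, optionally keeping null values."""
    payload = data if print_undefined else _drop_undefined(data)
    return json.dumps(payload, indent=2)
```

With the default print_undefined=False, a record such as {"cell_label": "A", "synonyms": None} would be written without the synonyms key; with print_undefined=True the key appears with a null value.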
- cas.ingest.ingest_user_table.ingest_user_data(data_file, config_file, generate_accession_ids=False)[source]
Ingests the given user data into the standard cell annotation schema data structure using the given configuration.
- Parameters:
data_file (str) – Unformatted user data in tsv/csv format.
config_file (str) – Configuration file path.
generate_accession_ids (bool) – Determines whether to incrementally generate accession_ids for all annotations that don’t have an id.
- Return type:
- cas.ingest.ingest_user_table.generate_ids_for_annotations(cas, config, labelset_ranks)[source]
Generates unique IDs for the annotations in the given CellTypeAnnotation object.
- Parameters:
cas (CellTypeAnnotation) – CellTypeAnnotation object
config (dict) – ingestion configuration dictionary
labelset_ranks (dict) – ranks of the labelsets
- Returns:
CellTypeAnnotation object with generated IDs.
- Return type:
CellTypeAnnotation
- cas.ingest.ingest_user_table.init_accession_managers(cas, config)[source]
Initializes an IncrementalAccessionManager for each labelset in the config.
- Parameters:
cas (CellTypeAnnotation) – CellTypeAnnotation object
config (dict) – ingestion configuration dictionary
- Returns:
dictionary of IncrementalAccessionManager objects
- Return type:
dict
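The idea of one incremental ID generator per labelset, used only for annotations that lack an id, can be sketched as follows. Everything here is an assumption made for illustration: the class IncrementalIdSketch, the "labelset_number" ID format, and the plain-dict annotations are hypothetical and do not reflect the actual IncrementalAccessionManager interface.

```python
from itertools import count


class IncrementalIdSketch:
    """Hypothetical per-labelset counter that skips IDs already in use."""

    def __init__(self, prefix, taken=()):
        self.prefix = prefix
        self.taken = set(taken)
        self._numbers = count(1)

    def next_id(self):
        # Advance the counter until an unused ID is found.
        for number in self._numbers:
            candidate = f"{self.prefix}_{number}"
            if candidate not in self.taken:
                self.taken.add(candidate)
                return candidate


def init_managers_sketch(annotations):
    """Build one generator per labelset, seeded with the existing IDs."""
    taken_by_labelset = {}
    for annotation in annotations:
        ids = taken_by_labelset.setdefault(annotation["labelset"], set())
        if annotation.get("id"):
            ids.add(annotation["id"])
    return {
        labelset: IncrementalIdSketch(labelset, ids)
        for labelset, ids in taken_by_labelset.items()
    }


def assign_missing_ids(annotations):
    """Give every annotation without an id the next free one."""
    managers = init_managers_sketch(annotations)
    for annotation in annotations:
        if not annotation.get("id"):
            annotation["id"] = managers[annotation["labelset"]].next_id()
    return annotations
```

Seeding each generator with the IDs already present keeps newly generated IDs from colliding with ones the user supplied.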
- cas.ingest.ingest_user_table.register_parent(field, labelset_ranks, parent_ao, parents)[source]
Registers the parent annotation object to the parents list.
- Parameters:
field – config field
labelset_ranks – labelset ranks dictionary
parent_ao – parent to add
parents – sparse parents list
- cas.ingest.ingest_user_table.get_annotation(ao_names, field, record)[source]
Creates an annotation object if it does not already exist in the ao_names dictionary for the same labelset.
- Parameters:
ao_names – list of existing annotation objects
field – config field
record – data record
- Returns:
annotation object
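The get-or-create behaviour described above can be sketched with a dictionary keyed by labelset and label. The function name, the key layout, and the dict-based annotation object are hypothetical choices for illustration. The package's own data structures may differ.

```python
def get_or_create_annotation(ao_names, labelset, label):
    """Return the annotation for (labelset, label), creating it if needed.

    Assumption: ao_names maps (labelset, label) tuples to annotation
    dicts, so the same label in two labelsets yields two objects.
    """
    key = (labelset, label)
    if key not in ao_names:
        ao_names[key] = {"labelset": labelset, "cell_label": label}
    return ao_names[key]
```

Repeated calls with the same labelset and label return the same object, so fields added while processing one record remain visible when a later record refers to the same annotation.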
- cas.ingest.ingest_user_table.add_user_annotations(ao, headers, record, utilized_columns)[source]
Adds user annotations that are not supported by the standard schema.
- Parameters:
ao – current annotation object
headers – all column names of the user data
record – a record in the user data
utilized_columns – list of processed columns
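One way to picture this step is to collect every column that no schema field consumed and attach the leftovers to the annotation object. The sketch below assumes dict-based records and a hypothetical "user_annotations" key; skipping empty cells is also an assumption rather than documented behaviour.

```python
def add_user_annotations_sketch(ao, headers, record, utilized_columns):
    """Attach values from columns that were not already processed."""
    extras = {
        header: record[header]
        for header in headers
        if header not in utilized_columns
        and record.get(header) not in (None, "")  # assumed: skip empty cells
    }
    if extras:
        ao["user_annotations"] = extras
    return ao
```

For a record with columns cluster, marker and notes where only cluster was utilized, the annotation object would gain the marker and notes values under the hypothetical user_annotations key.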
- cas.ingest.ingest_user_table.add_parent_node_names(ao, ao_names, cas, parents)[source]
Creates parent nodes if necessary and creates a cluster hierarchy by assigning parent_node_names.
- Parameters:
ao – current annotation object
ao_names – list of all created annotation objects
cas – main object
parents – list of current annotation object’s parents
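A sparse parents list can be read as a list indexed by labelset rank in which unused ranks stay empty. The sketch below works under that assumption and is not the package's implementation: the helper names, the use of None for empty slots, and the choice of the lowest-ranked registered entry as the direct parent are all hypothetical.

```python
def register_parent_sketch(rank, parent, parents):
    """Store parent at index `rank`, padding the list with None as needed."""
    while len(parents) <= rank:
        parents.append(None)
    parents[rank] = parent


def nearest_parent(parents):
    """Return the registered parent with the lowest rank, or None."""
    return next((parent for parent in parents if parent is not None), None)


def link_to_parent(ao, parents):
    """Record the label of the nearest parent on the annotation object."""
    parent = nearest_parent(parents)
    if parent is not None:
        ao["parent_label"] = parent["cell_label"]  # hypothetical field name
    return ao
```

Indexing by rank keeps the lookup independent of the order in which columns appear in the user table, since each parent lands in the slot of its labelset's rank.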