tommy.controller.corpus_controller.CorpusController
- class tommy.controller.corpus_controller.CorpusController[source]
Bases:
object
The corpus controller class is responsible for handling interactions with the corpus model.
- change_config_model_refs(corpus_model: CorpusModel) None [source]
Sets the reference to the corpus model.
- Parameters:
corpus_model – The corpus model
- Returns:
None
- corpus_version_id: int = -1
- extract_and_store_metadata(input_folder_path: str) None [source]
Gets the metadata from all files in the directory specified by the project settings and stores it in the corpus model
- Parameters:
input_folder_path – The new path to the input folder
- Returns:
None
- fileParsers: GenericFileImporter = <tommy.controller.file_import.generic_file_importer.GenericFileImporter object>
- get_dictionary() Dictionary [source]
Get the dictionary corresponding to the bag-of-words representation of the pre-processed documents. It is only set after pre-processing has been completed.
- Returns:
the dictionary of the pre-processed documents
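As a rough illustration of what a bag-of-words dictionary holds, the sketch below maps tokens to integer ids and converts a token list into (id, count) pairs. It is a plain-Python stand-in, not the Dictionary class returned by this method; the helper names build_token_ids and to_bag_of_words are hypothetical.

```python
from collections import Counter


def build_token_ids(documents):
    """Assign each distinct token an integer id in order of first appearance."""
    token_ids = {}
    for tokens in documents:
        for token in tokens:
            token_ids.setdefault(token, len(token_ids))
    return token_ids


def to_bag_of_words(tokens, token_ids):
    """Return sorted (token_id, count) pairs; unknown tokens are skipped."""
    counts = Counter(token_ids[t] for t in tokens if t in token_ids)
    return sorted(counts.items())


# Hypothetical pre-processed documents, each a list of tokens.
documents = [["topic", "model", "topic"], ["model", "corpus"]]
token_ids = build_token_ids(documents)
first_bow = to_bag_of_words(documents[0], token_ids)
```

The mapping can only be built once the tokenised documents exist, which is consistent with the dictionary being set only after pre-processing has completed.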
- get_metadata() list[Metadata] [source]
Gets the metadata from all files in the corpus model. This method assumes that extract_and_store_metadata has already been called.
- Returns:
The metadata of the files in the corpus
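The call-order assumption above can be made explicit by checking metadata_available before reading. The sketch below does not import tommy: StubCorpusController is a hypothetical stand-in that mirrors only the documented method names, and it reduces metadata to a file name and size purely for illustration.

```python
import os
import tempfile


class StubCorpusController:
    """Hypothetical stand-in mirroring the documented method names only."""

    def __init__(self) -> None:
        self._metadata = None  # nothing has been extracted yet

    def extract_and_store_metadata(self, input_folder_path: str) -> None:
        # Assumption: metadata is just name and size in this sketch.
        self._metadata = [
            {"name": name,
             "size": os.path.getsize(os.path.join(input_folder_path, name))}
            for name in sorted(os.listdir(input_folder_path))
        ]

    def metadata_available(self) -> bool:
        return self._metadata is not None

    def get_metadata(self) -> list:
        return list(self._metadata)


def load_metadata(controller, folder: str) -> list:
    """Extract metadata first if it has not been stored yet, then return it."""
    if not controller.metadata_available():
        controller.extract_and_store_metadata(folder)
    return controller.get_metadata()
```

Guarding the read this way keeps callers from depending on an earlier call having happened elsewhere.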
- get_processed_corpus() ProcessedCorpus [source]
Get an iterable of the processed corpus. Only works after pre-processing has been completed.
- Returns:
The pre-processed files and a reference to their metadata
- get_raw_bodies() Generator[RawBody, None, None] [source]
Get a generator that reads all the raw file contents from the input folder
- Returns:
A generator for just the contents of the raw corpus, without the metadata
- get_raw_files() Generator[RawFile, None, None] [source]
Get a generator that reads all the raw file contents and their metadata from the input folder
- Returns:
A generator of the raw corpus
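Because both raw accessors return generators, files are read lazily, one at a time, as the caller iterates. The sketch below shows that pattern under stated assumptions: SketchRawFile, read_raw_files and read_raw_bodies are hypothetical names, and the real RawFile and RawBody types may carry different fields.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Generator


@dataclass
class SketchRawFile:
    """Hypothetical pairing of a file's content with minimal metadata."""
    name: str
    body: str


def read_raw_files(folder: str) -> Generator[SketchRawFile, None, None]:
    # Each file is read only when the consumer requests the next item.
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            yield SketchRawFile(name=path.name,
                                body=path.read_text(encoding="utf-8"))


def read_raw_bodies(folder: str) -> Generator[str, None, None]:
    # Same traversal, but the metadata part is dropped.
    for raw_file in read_raw_files(folder):
        yield raw_file.body
```

A generator of this kind is exhausted after one pass, so a caller that needs the contents twice has to request a fresh generator.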
- metadata_available() bool [source]
Check if the metadata is available in the corpus model
- Returns:
True if metadata is available, False otherwise
- property metadata_changed_event
This event is triggered every time the metadata of the corpus changes, so that the UI can update itself to show the metadata.
- on_input_folder_path_changed(input_folder_path: str) None [source]
Gets the metadata from all files in the directory specified by the project settings, stores it in the corpus model, and triggers the metadata-changed event.
- Parameters:
input_folder_path – The new path to the input folder
- Returns:
None
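The interplay between a folder change and the metadata-changed event follows a publish/subscribe pattern. The sketch below illustrates it with hypothetical classes: SketchEvent, its subscribe and publish methods, and SketchController are assumptions made for this example and are not the event type used by tommy.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class SketchEvent(Generic[T]):
    """Hypothetical minimal event: subscribers are called on publish."""

    def __init__(self) -> None:
        self._subscribers: list[Callable[[T], None]] = []

    def subscribe(self, callback: Callable[[T], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, payload: T) -> None:
        # Iterate over a copy so callbacks may subscribe during dispatch.
        for callback in list(self._subscribers):
            callback(payload)


class SketchController:
    """Hypothetical controller that republishes metadata on a folder change."""

    def __init__(self) -> None:
        self.metadata_changed_event = SketchEvent()
        self._metadata = []

    def on_input_folder_path_changed(self, input_folder_path: str) -> None:
        # Assumption: real extraction is replaced by a placeholder string.
        self._metadata = [f"metadata for {input_folder_path}"]
        self.metadata_changed_event.publish(self._metadata)


received = []
controller = SketchController()
controller.metadata_changed_event.subscribe(received.append)
controller.on_input_folder_path_changed("input_folder")
```

A UI component would subscribe once and then redraw whenever the callback fires, without polling the controller.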
- preprocess_corpus() ProcessedCorpus [source]
Preprocesses the corpus and saves it in the corpus model.
- set_controller_refs(project_settings_controller: ProjectSettingsController, preprocessing_controller: PreprocessingController) None [source]
Sets the references to the project settings controller and the preprocessing controller, and subscribes to the publisher of project settings.
- Parameters:
project_settings_controller – the project settings controller
preprocessing_controller – the preprocessing controller
- Returns:
None
- set_dictionary(dictionary: Dictionary) None [source]
Set the dictionary corresponding to the bag-of-words representation of the pre-processed documents.
- Parameters:
dictionary – corpora.Dictionary: the dictionary of the pre-processed documents
- Returns:
None
- set_model_refs(corpus_model: CorpusModel) None [source]
Sets the reference to the corpus model.
- Parameters:
corpus_model – The corpus model
- Returns:
None