The tools directory contains helper methods and modules that allow for lower-level access to the Mephisto data model than the clients provide. These may be useful for creating custom workflows and scripts that are built on Mephisto.
At the moment this folder contains the following:
- `MephistoDataBrowser`: The `MephistoDataBrowser` is a convenience tool for accessing all of the units and data associated with a specific task run or task name. It is generally used when reviewing or compiling data.
- `scripts.py`: The methods available in `scripts.py` are to be used in user scripts that rely on Mephisto. At the moment, these methods allow for easy configuration of a database as well as augmentation of a script config for use with Mephisto.
The `MephistoDataBrowser` can currently retrieve all `Unit`s that are associated with a given task or task run. It can also retrieve the relevant data about a `Unit`, including the work done for that `Unit`, if the `Unit` is completed.
It has three usable methods at the moment:
- `get_units_for_run_id`: This will return a list of all final `Unit`s associated with the given `task_run_id`. These will all be in a terminal state, such as `REJECTED`. Units that are still in flight will not appear using this method.
- `get_units_for_task_name`: This will go through all task runs that share the given `task_name`, and collect their units in the same manner.
- `get_data_from_unit`: When given a `Unit` that is in a terminal state, this method will return data about that `Unit`, including the Mephisto id of the worker, the status of the work, the data saved by this `Unit`, and the start and end times for when the worker produced the data.
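
As a rough sketch of how these three methods fit together, the snippet below gathers the data for every finished unit under a task name and tallies the statuses. `summarize_task`, `FakeDataBrowser`, and the `status` and `worker_id` dictionary keys are assumptions made for illustration. In a real script the `data_browser` argument would be a `MephistoDataBrowser` instance, and the keys it returns should be checked against your Mephisto version.

```python
from collections import Counter


def summarize_task(data_browser, task_name):
    """Collect saved data for every finished unit under task_name."""
    # Only units in a terminal state are returned by this call.
    units = data_browser.get_units_for_task_name(task_name)
    results = [data_browser.get_data_from_unit(unit) for unit in units]
    # "status" is an assumed key name for the work status field.
    status_counts = Counter(entry["status"] for entry in results)
    return results, status_counts


class FakeDataBrowser:
    """Hypothetical stand-in so the sketch runs without a Mephisto database."""

    def __init__(self, records):
        # Maps a task name to a list of per-unit data dictionaries.
        self._records = records

    def get_units_for_task_name(self, task_name):
        count = len(self._records.get(task_name, []))
        return [(task_name, index) for index in range(count)]

    def get_data_from_unit(self, unit):
        task_name, index = unit
        return self._records[task_name][index]


fake_browser = FakeDataBrowser(
    {
        "demo-task": [
            {"worker_id": "1", "status": "accepted", "data": {"answer": "yes"}},
            {"worker_id": "2", "status": "rejected", "data": {"answer": "no"}},
        ]
    }
)
results, status_counts = summarize_task(fake_browser, "demo-task")
print(status_counts)
```

Because `summarize_task` only relies on the two method names described above, the same function should work unchanged once the fake browser is swapped for a real one.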
This file contains a number of helper functions that are useful for running reviews over Mephisto data. We provide a high-level script for doing a 'review-by-worker' style evaluation, as well as breakout helper functions that could be useful in alternate review flows.
- `run_examine_by_worker`: This function takes a function `format_data_for_printing` that consumes the result of `MephistoDataBrowser.get_data_from_unit`, and should print out to terminal a reviewable format. It also takes optional arguments. If `task_name` is provided, the script will be run in review mode without querying the user for anything.
- `print_results`: This function takes a task name and display function `format_data_for_printing`, and an optional int `limit`, and prints up to `limit` results to stdout.
- `format_worker_stats`: Takes in a worker id and set of previous worker stats, and returns the previous stats in the format `(accepted_count | total_rejected_count (soft_rejected_count) / total count)`.
- `prompt_for_options`: Prompts the user for options such as `approve_qualification`. Any option already provided as an argument is skipped. Returns these values after confirming with the user. Blank entries fall back to a default.
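
To make the display function and the stats format more concrete, here is a minimal sketch. The dictionary keys (`worker_id`, `status`, `data`, `task_start`, `task_end`) and the helpers `render_unit` and `stats_line` are hypothetical names chosen for this example, and the assumption that the total is the sum of accepted and rejected work is ours, not something the helpers are confirmed to do.

```python
def render_unit(unit_data):
    """Build a short, human-readable summary of one unit's data."""
    lines = [
        f"Worker: {unit_data.get('worker_id', 'unknown')}",
        f"Status: {unit_data.get('status', 'unknown')}",
    ]
    start = unit_data.get("task_start")
    end = unit_data.get("task_end")
    # Only show a duration when both timestamps are present.
    if start is not None and end is not None:
        lines.append(f"Duration: {end - start:.1f}s")
    lines.append(f"Data: {unit_data.get('data')}")
    return "\n".join(lines)


def format_data_for_printing(unit_data):
    """Display function in the shape described above: prints a reviewable summary."""
    print(render_unit(unit_data))


def stats_line(accepted, rejected, soft_rejected):
    """Format counts as (accepted | total_rejected (soft_rejected) / total)."""
    # Assumption: soft rejections count toward the rejected total.
    total_rejected = rejected + soft_rejected
    total = accepted + total_rejected
    return f"({accepted} | {total_rejected} ({soft_rejected}) / {total})"


example_unit = {
    "worker_id": "42",
    "status": "completed",
    "data": {"answer": "yes"},
    "task_start": 100.0,
    "task_end": 112.5,
}
format_data_for_printing(example_unit)
print(stats_line(accepted=3, rejected=1, soft_rejected=1))
```

Keeping the string building in `render_unit` separate from the printing makes the output easy to check in a test before handing the display function to a review script.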
The `scripts.py` file contains a few helper methods for running scripts relying on the `MephistoDB`. They are as follows:
- `get_db_from_config`: This method takes in a hydra-produced `MephistoConfig` (such as a `TaskConfig`), and returns an initialized `MephistoDB` compatible with the configuration. Right now this exclusively leverages one database implementation.
- `augment_config_from_db`: This method takes in a config and a `MephistoDB`, parses the content to ensure that a valid requester and architect setup exists, and then updates the config. It also has validation steps that require user confirmation for live runs. It returns the updated config.
- `load_db_and_process_config`: This is a convenience method that wraps the above two methods, loading in the appropriate `MephistoDB` and using it to process the script config. It returns the db and a valid config.
- `process_config_and_get_operator`: A convenience wrapper of the above method that _also_ creates an operator.
- `task_script`: This decorator is used to register a Mephisto script for launching task runs. It takes in an argument such as `default_config_file` (yaml filename without the .yaml) to specify how the script is configured, and wraps the script's main function.
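
The general shape of a decorator that takes a config file name and wraps a main function can be sketched as follows. This is a toy stand-in, not Mephisto's implementation: `toy_task_script`, the `overrides` parameter, and the plain dictionary used in place of a parsed hydra config are all hypothetical.

```python
import functools


def toy_task_script(default_config_file):
    """Hypothetical decorator factory illustrating the wrapping pattern."""

    def decorator(main):
        @functools.wraps(main)
        def wrapped(overrides=None):
            # A real script would load "<default_config_file>.yaml" via hydra.
            # Here a plain dict stands in for the parsed configuration.
            cfg = {"config_file": f"{default_config_file}.yaml"}
            cfg.update(overrides or {})
            return main(cfg)

        return wrapped

    return decorator


@toy_task_script(default_config_file="example")
def main(cfg):
    return f"launching with {cfg['config_file']}"


print(main())
```

The point of the pattern is that the script author only writes `main`, while the decorator takes care of turning a config file name into a ready-to-use configuration before `main` runs.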