mephisto.tools
Tools
The tools directory contains helper methods and modules that allow for lower-level access to the Mephisto data model than the clients provide. These may be useful for creating custom workflows and scripts that are built on Mephisto.
At the moment this folder contains the following:
- `MephistoDataBrowser`: The `MephistoDataBrowser` is a convenience tool for accessing all of the units and data associated with a specific task run or task name. It is generally used when reviewing or compiling data.
- `scripts.py`: The methods available in `scripts.py` are to be used in user scripts that rely on Mephisto. At the moment, these scripts allow for easy configuration of a database as well as augmentation of a script config for use in a Mephisto `TaskRun`.
MephistoDataBrowser
The `MephistoDataBrowser` can currently retrieve all `Unit`s that are associated with a given task or task run. It can also retrieve the relevant data about a `Unit`, including the work done for that `Unit`, if the `Unit` is completed.
It has three usable methods at the moment:
- `get_units_for_run_id`: This will return a list of all final `Unit`s associated with the given `task_run_id`. These will all be in a terminal state, such as `COMPLETED`, `ACCEPTED` or `REJECTED`. Units that are still in flight will not appear using this method.
- `get_units_for_task_name`: This will go through all task runs that share the given `task_name`, and collect their units in the same manner as `get_units_for_run_id`.
- `get_data_from_unit`: When given a `Unit` that is in a terminal state, this method will return data about that `Unit`, including the Mephisto id of the worker, the status of the work, the data saved by this `Unit`, and the start and end times for when the worker produced the data.
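The sketch below illustrates one way to combine the methods described above: look up the units for a task name, then gather the data for each. To stay self-contained it uses a hypothetical `FakeBrowser` stand-in rather than a real `MephistoDataBrowser`; only the two method names come from the text, while the unit values and dictionary keys are assumptions made for illustration.

```python
from typing import Any, Dict, List


class FakeBrowser:
    """Hypothetical stand-in exposing the two method names described above."""

    def __init__(self, units_by_task: Dict[str, List[str]]) -> None:
        self._units_by_task = units_by_task

    def get_units_for_task_name(self, task_name: str) -> List[str]:
        # Unknown task names yield an empty list in this stand-in.
        return list(self._units_by_task.get(task_name, []))

    def get_data_from_unit(self, unit: str) -> Dict[str, Any]:
        # Key names are illustrative, not the library's documented schema.
        return {"unit": unit, "status": "completed", "data": {"unit_echo": unit}}


def collect_task_data(browser: Any, task_name: str) -> List[Dict[str, Any]]:
    """Fetch the units for a task name and gather each unit's data."""
    units = browser.get_units_for_task_name(task_name)
    return [browser.get_data_from_unit(unit) for unit in units]


browser = FakeBrowser({"my-task": ["unit-1", "unit-2"]})
results = collect_task_data(browser, "my-task")
print(len(results))
```

Because `collect_task_data` only relies on the two method names, the same function could in principle be handed a real browser instance, provided its methods behave as described above.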
examine_utils.py
This file contains a number of helper functions that are useful for running reviews over Mephisto data. We provide a high-level script for doing a 'review-by-worker' style evaluation, as well as breakout helper functions that could be useful in alternate review flows.
- `run_examine_by_worker`: This function takes a function `format_data_for_printing` that consumes the result of `MephistoDataBrowser.get_data_from_unit` and should print a reviewable format to the terminal. It optionally takes in `task_name`, `block_qualification`, and `approve_qualification` arguments. If `task_name` is provided, the script will be run in review mode without querying the user for anything.
- `print_results`: This function takes a task name, a display function `format_data_for_printing`, and an optional int `limit`, and prints up to `limit` results to stdout.
- `format_worker_stats`: Takes in a worker id and set of previous worker stats, and returns the previous stats in the format `(accepted_count | total_rejected_count (soft_rejected_count) / total count)`.
- `prompt_for_options`: Prompts the user for `task_name`, `block_qualification`, and `approve_qualification`, skipping any value that was already provided as an argument. Returns these values after confirming with the user, using `None` for any left blank.
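As an illustration of the `format_data_for_printing` callback mentioned above, here is a minimal sketch of a formatter for one unit's data. The dictionary keys (`worker_id`, `status`, `data`) are assumptions based on the fields described for `get_data_from_unit`, not a confirmed schema. The text above does not spell out whether the callback should print or return its text, so this sketch returns the string and prints it separately.

```python
from typing import Any, Dict


def format_data_for_printing(data: Dict[str, Any]) -> str:
    """Build a reviewable text block from one unit's data (keys are assumed)."""
    worker = data.get("worker_id", "unknown")
    status = data.get("status", "unknown")
    saved = data.get("data", {})
    lines = [
        "-" * 20,  # visual separator between units
        f"Worker: {worker}",
        f"Status: {status}",
        f"Data: {saved}",
    ]
    return "\n".join(lines)


# Illustrative input only; real unit data may contain different fields.
example = {"worker_id": "7", "status": "completed", "data": {"answer": "yes"}}
formatted = format_data_for_printing(example)
print(formatted)
```

Using `dict.get` with defaults keeps the formatter from failing on units that lack one of the assumed keys.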
scripts.py
This file contains a few helper methods for running scripts relying on the `MephistoDB`. They are as follows:
- `get_db_from_config`: This method takes in a hydra-produced `DictConfig` containing a `MephistoConfig` (such as a `TaskConfig`), and returns an initialized `MephistoDB` compatible with the configuration. Right now this exclusively leverages the `LocalMephistoDB`.
- `augment_config_from_db`: This method takes in a `TaskConfig` and a `MephistoDB`, parses the content to ensure that a valid requester and architect setup exists, and then updates the config. It also has validation steps that require user confirmation for live runs. It returns the updated config.
- `load_db_and_process_config`: This is a convenience method that wraps the above two methods, loading in the appropriate `MephistoDB` and using it to process the script config. It returns the db and a valid config.
- `process_config_and_get_operator`: A convenience wrapper of the above method that also creates an `Operator`.
- `task_script`: This decorator is used to register a Mephisto script for launching task runs. It takes in either a `config` (`TaskConfig`) or `default_config_file` (yaml filename without the .yaml) argument to specify how the script is configured, and wraps a main that takes in an `Operator` and `DictConfig` as arguments.
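To make the decorator pattern behind `task_script` concrete, the following toy sketch shows how a decorator factory can accept either a `config` or a `default_config_file` and then call a wrapped `main` with an operator and a config. This is not Mephisto's implementation: `toy_task_script` and `ToyOperator` are hypothetical names, the config is a plain dict rather than a `DictConfig`, and treating the two arguments as mutually exclusive is an assumption.

```python
import functools
from typing import Any, Callable, Dict, List, Optional


class ToyOperator:
    """Hypothetical stand-in for an Operator; it only records launched configs."""

    def __init__(self) -> None:
        self.launched: List[Dict[str, Any]] = []

    def launch(self, cfg: Dict[str, Any]) -> None:
        self.launched.append(cfg)


def toy_task_script(
    config: Optional[Dict[str, Any]] = None,
    default_config_file: Optional[str] = None,
) -> Callable[[Callable[..., Any]], Callable[[], Any]]:
    """Toy decorator factory; assumes exactly one argument is supplied."""
    if (config is None) == (default_config_file is None):
        raise ValueError("Provide exactly one of config or default_config_file")

    def decorator(main: Callable[..., Any]) -> Callable[[], Any]:
        @functools.wraps(main)
        def wrapper() -> Any:
            # A real implementation would load and validate configuration
            # here; this sketch just builds a plain dict.
            if config is not None:
                cfg = dict(config)
            else:
                cfg = {"config_file": default_config_file}
            operator = ToyOperator()
            return main(operator, cfg)

        return wrapper

    return decorator


@toy_task_script(default_config_file="example")
def main(operator: ToyOperator, cfg: Dict[str, Any]) -> List[Dict[str, Any]]:
    operator.launch(cfg)
    return operator.launched


result = main()
print(result)
```

The point of the pattern is that the decorated `main` is called with no arguments, while the wrapper takes care of building the operator and config that the script body receives.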