Architects, Routers, and mephisto-task: The Architect API
The Architect API is what allows Mephisto to completely abstract away the process of getting workers to operate in Mephisto tasks. In short, it aims to cover four primary functions:
- Agent registration and validation
- Seamless frontend access to Unit data made available on the Python side, both when accepting and working on a task.
- Method to submit completed task data to the backend when a task is complete.
- Worker liveliness and status checking and syncing.
We also have a few additional goals, which influence design decisions:
- Tasks without the need for live interaction with the backend should be RESTful rather than socket-based. This removes the need to be fully connected throughout the task, reducing unexpected disconnects.
- The system should include failsafes to allow for retries.
- We should prioritize low latency where possible.
- The interaction layer should be simple.
This document aims to describe how we got from these requirements and goals to the system we have now.
Mephisto allows for multiple stages of filtering for a worker, which leads the registration process to be staged. The complete order is listed below:
- CrowdProvider-set qualifications may prevent a worker from even seeing a task.
- Qualifications may be set locally that a CrowdProvider is not aware of. Workers not meeting these Mephisto qualifications (as set by SharedState.qualifications) should be filtered out.
- Configuration requirements may be set, like allowed_concurrent, which could prevent a worker from completing more than a maximum number of units on a task or from working on too many tasks at once.
- Of all available Assignments the worker hasn't yet worked on, workers may be further filtered by a user-provided filter function.
This also provides a template of the failure conditions for a worker to be made aware of:
- Worker doesn't meet the qualifications for a task.
- Worker is working on too many tasks at a time.
- Worker has completed the maximum amount of units for the given task, as set by the requester.
- None of the currently available units are available for the Worker.
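The staged filtering and its failure conditions can be sketched as a single function that returns either a unit or the reason none was assigned. This is a minimal illustration, not Mephisto's implementation: `SketchWorker`, `SketchTaskConfig`, `RegistrationFailure`, `unit_cap`, `unit_filter`, and `pick_unit` are all hypothetical names, `allowed_concurrent` is treated as a simple hard cap, and the CrowdProvider stage is assumed to have happened upstream.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional, Set, Tuple


class RegistrationFailure(Enum):
    """Hypothetical labels for the four failure conditions listed above."""
    NOT_QUALIFIED = "not_qualified"
    TOO_MANY_CONCURRENT = "too_many_concurrent"
    UNIT_CAP_REACHED = "unit_cap_reached"
    NO_AVAILABLE_UNITS = "no_available_units"


@dataclass
class SketchWorker:
    qualifications: Set[str] = field(default_factory=set)
    active_units: int = 0
    completed_units: int = 0


@dataclass
class SketchTaskConfig:
    required_qualifications: Set[str]
    allowed_concurrent: int  # assumption: a hard cap on simultaneous units
    unit_cap: int  # hypothetical name for the per-worker unit maximum
    unit_filter: Callable[[SketchWorker, str], bool]  # hypothetical user filter


def pick_unit(
    worker: SketchWorker, config: SketchTaskConfig, open_units: List[str]
) -> Tuple[Optional[str], Optional[RegistrationFailure]]:
    """Apply each filtering stage in order; stop at the first failure."""
    # Stage: locally set qualifications the crowd provider is unaware of.
    if not config.required_qualifications <= worker.qualifications:
        return None, RegistrationFailure.NOT_QUALIFIED
    # Stage: configuration limits.
    if worker.active_units >= config.allowed_concurrent:
        return None, RegistrationFailure.TOO_MANY_CONCURRENT
    if worker.completed_units >= config.unit_cap:
        return None, RegistrationFailure.UNIT_CAP_REACHED
    # Stage: user-provided filter over the remaining open units.
    for unit_id in open_units:
        if config.unit_filter(worker, unit_id):
            return unit_id, None
    return None, RegistrationFailure.NO_AVAILABLE_UNITS
```

Returning the failure reason alongside the result keeps the frontend able to tell the worker which of the four conditions applied.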
Frontend access to backend Unit data
Mephisto must provide users with two key ways to provide data to a worker:
- Initialization data for an Assignment, which defines the data made available for a Unit on the frontend, and which may also be used to duplicate units.
- Providing additional data during a live task, including derived data from partial work. This data should be available via both pull and push mechanisms.
Covering these two areas ensures that it's possible to create a broad variety of task types.
Task data submission
Like backend data, completed task data should be able to be either tracked during the course of a task or submitted at the end, or both. This leads to two main event types:
- Posting completed task data for a task during one of the completion points (onboarding, completing the main Unit)
- Sending arbitrary task data at any point during a task (for data that needs a response, or longer forms of logging)
Worker Liveliness and Status
Over the course of a task, there are a few key states to consider:
- Waiting (for other Units in an assignment)
- Completed (task complete, or partner disconnects)
- Disconnected (Server disconnect, timeout, return)
- Failed to connect (no available tasks)
We've also considered a "post-task" state after the completion of a task for surveys or related content.
The routing server is responsible for keeping track of the liveliness of individual Agents. It does so by observing disconnects on the socket, as well as timeouts on heartbeat packets.
Certain status transitions will come in from the main server, and the router may be responsible for cleaning up local state or caching results at this stage.
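As a rough illustration of heartbeat-based liveliness tracking, the sketch below records the last time each agent was heard from and treats an agent as disconnected after either a socket close or a heartbeat timeout. The class name `LivenessTracker`, its methods, and the status strings are assumptions made for this example and are not taken from the Mephisto router.

```python
import time
from typing import Dict, Optional


class LivenessTracker:
    """Tracks per-agent heartbeats; all names here are hypothetical."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def record_heartbeat(self, agent_id: str, now: Optional[float] = None) -> None:
        # `now` is injectable so the logic can be exercised without sleeping.
        self._last_seen[agent_id] = time.monotonic() if now is None else now

    def record_socket_close(self, agent_id: str) -> None:
        # An observed socket disconnect drops the agent immediately.
        self._last_seen.pop(agent_id, None)

    def status(self, agent_id: str, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        seen = self._last_seen.get(agent_id)
        if seen is None or now - seen > self.timeout_seconds:
            return "disconnected"
        return "connected"
```

Using a monotonic clock avoids spurious timeouts when the system wall clock is adjusted.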
RESTful vs Socket interactions
We've divided Mephisto tasks into two primary types, static tasks and live tasks. The former shouldn't require backend access through the majority of the task, only during key points (starting, submission), while the latter can have direct communication throughout the task.
Beyond just being simpler to implement, static tasks also have the advantage of being lenient on worker behavior; if a worker suspends progress and returns within the timeout window, they aren't penalized, even if their machine were to sleep during that window.
The Mephisto backend channels expect to communicate with the router in a certain way. Our primary Channel is the WebsocketChannel, and as such we expect to receive Packets over the wire from the routing server.
The core carrier for information in the Mephisto Architect API is the Packet class. Downstream, packets act as the way to tie an Agent class in Python to an actual human worker.
Channel is the primary way of transmitting packets, with the WebsocketChannel being the main implementation Mephisto currently uses.
The ClientIOHandler is responsible for using and interpreting packets, so it defines the key types to be handled.
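To make the idea of a packet concrete, here is a minimal sketch of a JSON-serializable message carrying a type, an `agent_id`, and a data payload. `SketchPacket`, `to_wire`, and `from_wire` are hypothetical names for illustration; the actual Packet class may carry additional fields and use different method names.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SketchPacket:
    """Simplified stand-in for a packet; not the real Mephisto class."""

    packet_type: str
    agent_id: str
    data: Dict[str, Any] = field(default_factory=dict)

    def to_wire(self) -> str:
        # Serialize to a JSON string suitable for a websocket text frame.
        return json.dumps(
            {
                "packet_type": self.packet_type,
                "agent_id": self.agent_id,
                "data": self.data,
            }
        )

    @classmethod
    def from_wire(cls, raw: str) -> "SketchPacket":
        # Parse a JSON string back into a packet; a missing payload is empty.
        obj = json.loads(raw)
        return cls(obj["packet_type"], obj["agent_id"], obj.get("data", {}))
```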
There are a number of different packet types used by Mephisto:
- PACKET_TYPE_ALIVE: Used to mark the success of a new socket connection.
- PACKET_TYPE_SUBMIT_ONBOARDING: Used to handle submission of onboarding.
- PACKET_TYPE_SUBMIT_UNIT: Used to handle submission of a Unit.
- PACKET_TYPE_CLIENT_BOUND_LIVE_UPDATE: Used to send any new live data to an agent.
- PACKET_TYPE_MEPHISTO_BOUND_LIVE_UPDATE: Used for the frontend to send any type of data to the backend, usually to be processed by a user-defined callback.
- PACKET_TYPE_REGISTER_AGENT: Used to request a new Agent and Unit data for a specific worker on a task.
- PACKET_TYPE_AGENT_DETAILS: Used to respond with the details of a worker registration request.
- PACKET_TYPE_UPDATE_STATUS: Used by Mephisto to push a status update to the router and frontend worker.
- PACKET_TYPE_REQUEST_STATUSES: Used by Mephisto to poll for the current statuses for a worker.
- PACKET_TYPE_RETURN_STATUSES: Used by the router to return updates for all of the currently registered agents.
- PACKET_TYPE_ERROR: Used by the router and frontend to communicate to the Python backend that an error has occurred.
While this is the "Architect API", most of the responsibilities for the architect are merely pointing the ClientIOHandler to the correct Channels for sending packets for a given client. Ultimately it is the ClientIOHandler that dictates the responsibilities that the transmitted messages carry:
- The ClientIOHandler must send a PACKET_TYPE_ALIVE whenever it opens a new channel (in this case, to a router).
- During Unit registration, in response to a PACKET_TYPE_REGISTER_AGENT, the handler must return a PACKET_TYPE_AGENT_DETAILS with the details of an agent and its initialization data, or the failure status for why an agent couldn't be created.
- During a Unit, the handler must process PACKET_TYPE_MEPHISTO_BOUND_LIVE_UPDATE and direct the content to the correct handlers, and should send a PACKET_TYPE_CLIENT_BOUND_LIVE_UPDATE for any Agent.send_data() call on a live connected Agent. The handler must also process PACKET_TYPE_SUBMIT_* packets for the key transitions of a Unit in progress, and should respond with PACKET_TYPE_AGENT_DETAILS for a submit on an onboarding agent.
- Over any run, the handler should poll with PACKET_TYPE_REQUEST_STATUSES and update local Agent statuses on disconnects from PACKET_TYPE_RETURN_STATUSES. This also acts as a heartbeat from the Python core to the router. The handler should also take PACKET_TYPE_ERROR and log the contents if this ever occurs.
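The responsibilities above amount to a dispatch on packet type. The sketch below shows one way such a dispatch could look for three of the packet types. It is illustrative only: `SketchIOHandler`, the `register` callback, the `request_id` field, and the string values given to the constants are assumptions for this example rather than the behavior of the real ClientIOHandler.

```python
from typing import Any, Callable, Dict, List

# String values are placeholders chosen for this sketch.
PACKET_TYPE_ALIVE = "alive"
PACKET_TYPE_REGISTER_AGENT = "register_agent"
PACKET_TYPE_AGENT_DETAILS = "agent_details"
PACKET_TYPE_RETURN_STATUSES = "return_statuses"
PACKET_TYPE_ERROR = "log_error"

Packet = Dict[str, Any]


class SketchIOHandler:
    def __init__(
        self,
        send: Callable[[Packet], None],
        register: Callable[[Packet], Dict[str, Any]],
    ) -> None:
        self._send = send  # pushes a packet onto the channel
        self._register = register  # hypothetical hook that creates an agent
        self.agent_statuses: Dict[str, str] = {}
        self.logged_errors: List[Any] = []
        self._dispatch: Dict[str, Callable[[Packet], None]] = {
            PACKET_TYPE_REGISTER_AGENT: self._on_register,
            PACKET_TYPE_RETURN_STATUSES: self._on_statuses,
            PACKET_TYPE_ERROR: self._on_error,
        }

    def open_channel(self) -> None:
        # Announce the new connection to the router.
        self._send({"packet_type": PACKET_TYPE_ALIVE, "agent_id": None, "data": {}})

    def handle(self, packet: Packet) -> None:
        handler = self._dispatch.get(packet["packet_type"])
        if handler is None:
            raise ValueError(f"unhandled packet type: {packet['packet_type']}")
        handler(packet)

    def _on_register(self, packet: Packet) -> None:
        # Reply with agent details, or with whatever failure the hook reports.
        details = self._register(packet)
        data = {"request_id": packet["data"].get("request_id"), **details}
        self._send(
            {
                "packet_type": PACKET_TYPE_AGENT_DETAILS,
                "agent_id": details.get("agent_id"),
                "data": data,
            }
        )

    def _on_statuses(self, packet: Packet) -> None:
        # Sync local agent statuses with what the router reports.
        self.agent_statuses.update(packet["data"])

    def _on_error(self, packet: Packet) -> None:
        self.logged_errors.append(packet["data"])
```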
The primary responsibility of the router is to take incoming packets from client connections and direct them to the core Mephisto ClientIOHandler, and to do the reverse as well. All packets will have a core agent_id field denoting either the sender or receiver of the packet, depending on the packet type. The only exception is the PACKET_TYPE_ALIVE, which is directed to the router and allows for any registration of an incoming connection.
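A minimal sketch of this routing behavior follows. It assumes, purely for illustration, that the alive packet carries the identifier to register under in its data payload and that each connection is represented by a callback; `SketchRouter` and its method names are hypothetical and do not describe the actual router implementation.

```python
from typing import Any, Callable, Dict

PACKET_TYPE_ALIVE = "alive"  # placeholder value for this sketch

Packet = Dict[str, Any]


class SketchRouter:
    def __init__(self, to_mephisto: Callable[[Packet], None]) -> None:
        self._to_mephisto = to_mephisto
        self._clients: Dict[str, Callable[[Packet], None]] = {}

    def on_client_packet(
        self, packet: Packet, reply: Callable[[Packet], None]
    ) -> None:
        if packet["packet_type"] == PACKET_TYPE_ALIVE:
            # Alive packets stop at the router and register the connection.
            self._clients[packet["data"]["agent_id"]] = reply
            return
        # Everything else is forwarded to the core handler unchanged.
        self._to_mephisto(packet)

    def on_mephisto_packet(self, packet: Packet) -> bool:
        # Deliver to the connection registered for the packet's agent_id.
        reply = self._clients.get(packet["agent_id"])
        if reply is None:
            return False
        reply(packet)
        return True
```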
Secondarily, the router is responsible for converting RESTful POST requests from mephisto-task into socket messages, and relaying the response as a standard POST response. This behavior is only for the PACKET_TYPE_REGISTER_AGENT and PACKET_TYPE_SUBMIT_ONBOARDING packets, and both of them will be serviced by PACKET_TYPE_AGENT_DETAILS responses. For these, it should be listening for the corresponding POST requests. POST requests to /log_error should result in forwarding a PACKET_TYPE_ERROR to the Python backend.
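One common way to bridge a request/response call over a socket is to tag each forwarded message with a request identifier and hold the pending response until a matching reply arrives. The sketch below illustrates that pattern under stated assumptions: `SketchRelay`, the `request_id` field, and the callback-based `respond` are hypothetical and are not taken from the Mephisto router.

```python
import uuid
from typing import Any, Callable, Dict

Packet = Dict[str, Any]


class SketchRelay:
    def __init__(self, forward: Callable[[Packet], None]) -> None:
        self._forward = forward  # sends a packet over the socket to the backend
        self._pending: Dict[str, Callable[[Dict[str, Any]], None]] = {}

    def on_post(
        self,
        packet_type: str,
        body: Dict[str, Any],
        respond: Callable[[Dict[str, Any]], None],
    ) -> str:
        # Tag the request so the eventual socket reply can be matched to it.
        request_id = uuid.uuid4().hex
        self._pending[request_id] = respond
        self._forward(
            {"packet_type": packet_type, "data": {**body, "request_id": request_id}}
        )
        return request_id

    def on_backend_packet(self, packet: Packet) -> bool:
        # Complete the held POST response if this reply matches a request.
        respond = self._pending.pop(packet["data"].get("request_id"), None)
        if respond is None:
            return False
        respond(packet["data"])
        return True
```

Popping the pending entry on first match means a duplicate reply is ignored rather than answered twice.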
Third, the router is responsible for keeping track of agent status, and acting as a cache for this information after disconnects. This allows for a worker to return to a task and have updated information about what has transpired, even when the main Mephisto server has cleaned up the related TaskRunner and live Agents.
Fourth, the router is responsible for serving the static task_config.json file, which allows the frontend to load certain details about the full task before going through any registration handshakes.
The useMephistoTask hook is responsible for allowing a worker to connect to a task and submit the relevant data. For this, it only needs to make POST requests related to the PACKET_TYPE_SUBMIT_* and PACKET_TYPE_REGISTER_AGENT events. The former should be triggered on handleSubmit, while the latter should trigger immediately on load.
The useMephistoLiveTask hook is responsible for the rest of the packets. Data packets should be sent via sendData and handled with the onLiveUpdate callback. So long as your data is JSON-serializable, you can send anything you want this way.
We also provide a useMephistoRemoteProcedureTask hook, which is a wrapper around useMephistoLiveTask that instead allows for making remote procedure calls from static tasks (when combined with the RemoteProcedureBlueprint or a similar API). Here people can make requests to the backend from an otherwise static task, and potentially receive responses and take action on them if they've registered callbacks. The only interface here is thus