Skip to content

Add graph reinit on failure#4237

Open
atobiszei wants to merge 11 commits into
mainfrom
atobisze_mp_reinit_graph
Open

Add graph reinit on failure#4237
atobiszei wants to merge 11 commits into
mainfrom
atobisze_mp_reinit_graph

Conversation

@atobiszei
Copy link
Copy Markdown
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings May 22, 2026 14:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds automatic MediaPipe graph reinitialization on inference failure when using the graph queue/pool, to prevent returning an errored (“poisoned”) graph instance back to the pool for reuse.

Changes:

  • Introduces GraphHelper::reinitialize() plus an RAII GraphReinitGuard that rebuilds a fresh CalculatorGraph unless dismissed on the success path.
  • Hooks the guard into queue-based unary and streaming inference paths to trigger graph rebuild after failures.
  • Moves per-graph shutdown logic into GraphHelper’s destructor and simplifies GraphQueue destructor.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/mediapipe_internal/mediapipegraphexecutor.hpp Adds GraphReinitGuard around queue-path inference and dismisses it on success.
src/mediapipe_internal/graphqueue.hpp Adds GraphHelper destructor, reinitialize() declaration, and new GraphReinitGuard type.
src/mediapipe_internal/graphqueue.cpp Implements GraphHelper shutdown + reinit logic; defaults GraphQueue destructor.
Comments suppressed due to low confidence (2)

src/mediapipe_internal/graphqueue.cpp:142

  • reinitialize() sets graph to nullptr on ObserveOutputStream failure (graph.reset()). The next request acquiring this GraphHelper will dereference graphHelper->graph in GraphIdGuard (graph(*graphHelper->graph)), which can crash if graph is null. Ensure reinitialize() always leaves a non-null graph, or update acquisition logic to detect nullptr and rebuild/replace before use.
        if (!absStatus.ok()) {
            SPDLOG_ERROR("Graph reinitialize: ObserveOutputStream failed: {}", absStatus.ToString());
            graph.reset();
            return;
        }

src/mediapipe_internal/graphqueue.cpp:162

  • reinitialize() sets graph to nullptr on StartRun failure (graph.reset()). The next request acquiring this GraphHelper will dereference graphHelper->graph in GraphIdGuard (graph(*graphHelper->graph)), which can crash if graph is null. Ensure reinitialize() always leaves a non-null graph, or update acquisition logic to detect nullptr and rebuild/replace before use.
    if (!absStatus.ok()) {
        SPDLOG_ERROR("Graph reinitialize: StartRun failed: {}", absStatus.ToString());
        graph.reset();
        return;
    }

Comment on lines +93 to +95
~GraphReinitGuard() {
if (!dismissed) {
helper.reinitialize(config, sidePacketMaps);
Comment on lines +110 to +114
void GraphHelper::reinitialize(const ::mediapipe::CalculatorGraphConfig& config, const GraphSidePackets& sidePacketMaps) {
SPDLOG_DEBUG("Reinitializing graph after error");
// Tear down the old graph (best-effort, errors expected since graph is in bad state)
if (this->graph) {
auto absStatus = this->graph->CloseAllPacketSources();
Comment on lines +128 to +132
auto absStatus = graph->Initialize(config);
if (!absStatus.ok()) {
SPDLOG_ERROR("Graph reinitialize: Initialize failed: {}", absStatus.ToString());
graph.reset();
return;
graph->Cancel();
}

void GraphHelper::reinitialize(const ::mediapipe::CalculatorGraphConfig& config, const GraphSidePackets& sidePacketMaps) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit test that will ensure failed graphs are reusable is missing

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Comment on lines +93 to +97
~GraphReinitGuard() {
if (!dismissed) {
helper.reinitialize(config, sidePacketMaps);
}
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case we could only log error anyway. Will add that.

Comment on lines +128 to +133
auto absStatus = graph->Initialize(config);
if (!absStatus.ok()) {
SPDLOG_ERROR("Graph reinitialize: Initialize failed: {}", absStatus.ToString());
graph.reset();
return;
}
Comment on lines 138 to 142
if (!absStatus.ok()) {
SPDLOG_DEBUG("Graph queue WaitUntilDone error: {}", absStatus.ToString());
SPDLOG_ERROR("Graph reinitialize: ObserveOutputStream failed: {}", absStatus.ToString());
graph.reset();
return;
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove graph.reset.

Comment on lines +157 to +162
absStatus = graph->StartRun(inputSidePackets);
if (!absStatus.ok()) {
SPDLOG_ERROR("Graph reinitialize: StartRun failed: {}", absStatus.ToString());
graph.reset();
return;
}
Comment thread src/logging.cpp
#endif
if (log_level == "DEBUG" || log_level == "TRACE") {
#ifdef _WIN32
_putenv_s("OPENVINO_LOG_LEVEL", "4");
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not using guard utils since we dont want cyclic dependency on logging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants