Version and Platform (required):
- Binary Ninja Version: 5.4.9696-dev Ultimate (c8719213)
- Edition: Ultimate
- OS: macOS
- OS Version: 26.4
- CPU Architecture: M1
Bug Description:
There's a bug in Binary Ninja involving auto stack variables: the HLIL doesn't display field assignments according to the auto stack variable's type. Instead it uses whatever type Binary Ninja's own analysis infers for the variable. For example, if the address of the stack variable is passed to a function, Binary Ninja may infer the type from the function parameter and use that for the HLIL assignment expressions rather than the auto stack variable type. The screenshots below do a better job of explaining this.
Steps To Reproduce:
Please provide all steps required to reproduce the behavior:
- Compile the native plugin code below and load it into Binary Ninja
- Compile the Objective-C code (it didn't need to be Objective-C, I was just re-using a test file) below using the compile command
- Open the Objective-C binary in Binary Ninja with options and select the native plugin function workflow
- Wait for initial analysis to complete and observe the issue in the function _CallsTakesSomeParentStruct
- You should be able to follow along with what's described in the screenshots below
Expected Behavior:
The HLIL should match the final screenshot without the user having to manually apply the type, and without the plugin having to use a user stack variable instead of an auto stack variable.
Screenshots/Video Recording:
After initial analysis has completed this is what the functions defined in the source code below look like:
As you can see, the function workflow provided below is doing its job: it defines the types and then applies SomeChildStruct to an auto stack variable. However, none of the field assignments following the initial declaration are shown correctly. Binary Ninja doesn't have much to go on when inferring the type of stackVar because it's ignoring the auto stack variable.
If I now set the correct type for _TakesSomeParentStruct then the HLIL will change:
This shows more clearly that Binary Ninja is completely ignoring the auto stack variable's type when determining the HLIL representation, relying purely on what it can infer about the stack variable's type from other analysis. Now that it assumes the variable has the type SomeParentStruct, it shows the assignments to that type's fields but still omits the assignments to field3 and field4 in SomeChildStruct.
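To make the omitted assignments concrete, here is a minimal layout sketch. It assumes the usual 64-bit layout with no padding between the uint64_t members, and OutsideParent is a hypothetical helper added purely for illustration. Under that assumption field3 and field4 sit at offsets 0x10 and 0x18, past the 0x10 bytes that SomeParentStruct covers, which would be consistent with those two stores having no field to map to when the variable is typed as SomeParentStruct.

```cpp
#include <cstddef>
#include <cstdint>

struct SomeParentStruct {
    uint64_t field1;
    uint64_t field2;
};

struct SomeChildStruct {
    SomeParentStruct parent;
    uint64_t field3;
    uint64_t field4;
};

// Layout assumptions (typical 64-bit ABI): the parent covers bytes [0x00, 0x10),
// and the child's own fields follow immediately after it.
static_assert(sizeof(SomeParentStruct) == 0x10, "parent is 16 bytes");
static_assert(offsetof(SomeChildStruct, field3) == 0x10, "field3 follows the parent");
static_assert(offsetof(SomeChildStruct, field4) == 0x18, "field4 follows field3");
static_assert(sizeof(SomeChildStruct) == 0x20, "child is 32 bytes");

// Hypothetical helper: true when a store at `offset` from the start of the
// variable lands outside the region described by SomeParentStruct.
constexpr bool OutsideParent(std::size_t offset)
{
    return offset >= sizeof(SomeParentStruct);
}
```

Both OutsideParent(offsetof(SomeChildStruct, field3)) and OutsideParent(offsetof(SomeChildStruct, field4)) evaluate to true, while the offsets of field1 and field2 do not.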
Now if I manually apply the type to stackVar the HLIL becomes what you would expect:
Binary:
I used the following code to generate the executable shown in the screenshots above:
#import <Foundation/Foundation.h>

struct SomeParentStruct {
    uint64_t field1;
    uint64_t field2;
};

struct SomeChildStruct {
    struct SomeParentStruct parent;
    uint64_t field3;
    uint64_t field4;
};

static void TakesSomeParentStruct(struct SomeParentStruct* input)
{
    input->field1 = 0;
    input->field2 = 1;
}

static void CallsTakesSomeParentStruct() {
    struct SomeChildStruct instance = {
        .parent = {
            .field1 = 1,
            .field2 = 2,
        },
        .field3 = 1,
        .field4 = 2,
    };
    TakesSomeParentStruct(&instance.parent);
}

int main(int argc, char *argv[]) {
    CallsTakesSomeParentStruct();
    return 0;
}
Compiled it with:
clang -fno-objc-arc -framework Foundation -O0 test.m -o test
This is the code for the native plugin I wrote to show the issue:
#include "binaryninjaapi.h"
#include <limits>
#include <mediumlevelilinstruction.h>

using namespace BinaryNinja;

static const std::string SourceForTypes = R"#(
struct SomeParentStruct {
    uint64_t field1;
    uint64_t field2;
};
struct SomeChildStruct {
    struct SomeParentStruct parent;
    uint64_t field3;
    uint64_t field4;
};
)#";

static void Action(Ref<AnalysisContext> ctx)
{
    const auto func = ctx->GetFunction();
    const auto bv = func->GetView();
    if (func->GetSymbol()->GetRawName() != "_CallsTakesSomeParentStruct")
        return;

    const auto mlil = ctx->GetMediumLevelILFunction();
    if (!mlil)
        return;

    QualifiedName stackVarTypeName("SomeChildStruct");
    auto type = bv->GetTypeByName(stackVarTypeName);
    if (!type) {
        std::vector<std::string> options;
        std::vector<std::string> includeDirs;
        TypeParserResult result;
        std::string errors;
        if (!bv->ParseTypesFromSource(SourceForTypes, options, includeDirs, result, errors)) {
            LogErrorF("Failed to parse types: {}", errors);
            return;
        }
        for (const auto& type : result.types)
            bv->DefineType(type.name.GetString(), type.name, type.type);
    }

    type = bv->GetTypeByName(stackVarTypeName);
    if (!type) {
        LogErrorF("Type was not defined");
        return;
    }

    func->CreateAutoStackVariable(-0x30, type->WithConfidence(std::numeric_limits<uint8_t>::max()), "stackVar");
}

extern "C" {
    BN_DECLARE_CORE_ABI_VERSION

    static constexpr auto FunctionWorkflowInfo = R"#({
        "title": "Issue ????",
        "description": "",
        "targetType" : "function"
    })#";

    BINARYNINJAPLUGIN bool CorePluginInit()
    {
        const auto functionWorkflow = Workflow::Get("core.function.metaAnalysis")->Clone("core.function.issue");
        const auto config = R"#({"name":"core.function.issue.auto-stack-issue","title":"Causes bad rendering in HLIL of auto stack variables","description":"","role":"action","eligibility":{"auto": {},"predicates": [{"type": "viewType", "value": ["DSCView","Mach-O"], "operator": "in"}]}})#";
        const auto activity = new Activity(config, &Action);
        functionWorkflow->RegisterActivity(activity);
        functionWorkflow->InsertAfter("core.function.generateMediumLevelIL", "core.function.issue.auto-stack-issue");
        Workflow::RegisterWorkflow(functionWorkflow, FunctionWorkflowInfo);
        return true;
    }
}
You need to set the function workflow to core.function.issue when opening the executable.
Additional Information:
I'm finding this is a general theme with auto-related functionality: Binary Ninja doesn't seem to respect the auto data variables I create or the auto function types I apply, so I always end up having to use user over auto. What I mean by "doesn't respect" is that auto data variables get overridden by later analysis. For example, analysis finds a piece of code doing a 4-byte read in the middle of the structure, decides the struct data variable I created is irrelevant, and replaces it with a single 4-byte integer variable. Any auto function types I apply don't appear to do anything at all.
It's confusing, though, because with auto stack variables it seems you must create the variable on every execution of the workflow, not just once. I guess that makes sense, but is the same expected for function types and data variables? Even so, it seems unlikely to fix the problem (I'm pretty sure I've tried before), and it doesn't fit the model where a function type or data variable is determined by analysing metadata in a binary section, which you only do once, not every time a function is analysed. I think in general these APIs are confusing (I recognise that Vector35 is aware of this), but I also can't tell whether I'm using the APIs wrong or they're just buggy. It seems like it could be both.
At this point I've mostly given up on using the auto APIs because of these issues, but that does degrade the user experience. For instance, if I use a user stack variable here, I can't apply it on every execution of the function workflow, otherwise the user can't override it. So if the user overrides it, then decides they were wrong and undefines the stack variable, all type information is gone and Binary Ninja falls back to its best guess. In theory, if auto stack variables were working correctly, the user could override the type and, on undefining it, the variable would fall back to the auto stack variable. That to me is part of the benefit of auto vs user: it gives a plugin a way to automatically inform Binary Ninja to improve its output while still allowing the user to override that.
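The fallback behaviour described above can be sketched as a toy model. This is only an illustration of the precedence I would expect (user, then auto, then inferred). It is not Binary Ninja's API or implementation, and StackVarTypes and Effective are hypothetical names.

```cpp
#include <optional>
#include <string>

// Toy model (not Binary Ninja code): the three possible sources of a stack
// variable's type, represented here simply by type names.
struct StackVarTypes {
    std::optional<std::string> user;       // set explicitly by the user
    std::optional<std::string> automatic;  // set by a plugin via an auto API
    std::string inferred;                  // the analysis's best guess

    // Expected precedence: user overrides auto, auto overrides inference.
    const std::string& Effective() const
    {
        if (user)
            return *user;
        if (automatic)
            return *automatic;
        return inferred;
    }
};
```

In this model, clearing user after an override makes Effective() return the auto type again rather than the inferred one, which is the fallback the report expects from auto stack variables.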