Working with Functions

Eptalights breaks everything down into functions, along with their related context-like the class they belong to and the file where they’re defined. This forms the core structure of the function-level data model. The Data model or structure for function - FunctionModel

1. get total functions

Get total number of functions in database.

print("total_functions =", api.get_total_functions())

# output
"""
total_functions = 76937
"""

2. listing functions

Listing all functions along with their corresponding source filenames in the project.

for fn in api.search_functions():
    print(f"name={fn.name}, filepath={fn.filepath}")

# output
"""
name=main, filepath=/example/src/07_array.cc
name=main, filepath=/example/src/07_array.cc
name=main, filepath=/example/src/14_struct_arithmetic.cc
name=addNumbers, filepath=/example/src/14_struct_arithmetic.cc
"""

Each Function comes a unique ID called fid. The fid is named based on the file path, the function name and its function overloading count.

for fn in api.search_functions():
    print(f"fid={fn.fid}")

"""
fid=/example/src/07_array.cc:main#1
fid=/example/src/14_struct_arithmetic.cc:addNumbers#1
fid=/example/src/14_struct_arithmetic.cc:main#1
fid=/example/src/10_union.cc:main#1
fid=/example/src/03_scanf_to_malloc.cc:main#1
"""

3. get function by id

We can retrieve a function by its fid.

fn = api.get_function_by_id(fid="/example/src/07_array.cc:main#1")
print(f"name={fn.name}, filepath={fn.filepath}")

"""
name=main, filepath=/example/src/07_array.cc
"""

4. get functions by file path

Get all functions of a specific file path.

for fn in api.get_functions_by_filepath(filepath="/example/src/07_array.cc"):
    print(f"name={fn.name}, filename={fn.filepath}")

# output
"""
name=main, filename=/example/src/07_array.cc
"""

5. search functions

Searching functions by filter_by_name, filter_by_filepath or filter_by_classname.

for fn in api.search_functions(filter_by_name="main"):
    print(f"name={fn.name}, filepath={fn.filepath}")

# output
"""
name=main, filepath=/example/src/07_array.cc
name=main, filepath=/example/src/14_struct_arithmetic.cc
name=main, filepath=/example/src/10_union.cc
name=main, filepath=/example/src/03_scanf_to_malloc.cc
name=main, filepath=/example/src/16_buffer_overflow.cc
name=main, filepath=/example/src/02_pointer_arithmetic.cc
...[redacted]
"""

for fn in api.search_functions(filter_by_filepath="07_array.cc"):
    print(f"name={fn.name}, filepath={fn.filepath}")

# output
"""
name=main, filepath=/example/src/07_array.cc
"""

6. dumping function Pseudo-C code

Dumping the high level Pseudo-C code off the a function. Remember everything you see in the Pseudo-C dump can easily be accessed from the python API.

fn = api.get_function_by_id(fid="/example/src/07_array.cc:main#1")
print(fn.decompile())

# output
"""
main  (  )
{
        <bb 2> :
        arr[0] = 10;
        arr[1] = 20;
        arr[2] = 30;
        arr[3] = 40;
        arr[4] = 50;
        arr1[0] = 1;
        arr1[1] = 2;
        arr1[2] = 3;
        arr1[3] = 4;
        arr1[4] = 5;
        arr2[0] = 1.0e+0;
        i = 0;

        <bb 3> :
        if ( i > 4 )
                goto <bb 5>;
        else
                goto <bb 4>;

        <bb 4> :
        $T1 = i;
        $T2 = $T1;
        $T3 = $T2 * 2.100000000000000088817841970012523233890533447265625e+0;
        $T4 = $T3;
        arr2[i] = $T4;
        i = i + 1;

        <bb 5> :
        $T22 = 0;
        arr = R"({)"R"(CLOBBER)"R"(})";
        arr1 = R"({)"R"(CLOBBER)"R"(})";
        arr2 = R"({)"R"(CLOBBER)"R"(})";

        <bb 6> :
        nop;
        return $T22;
}
"""

Every data objects are basically Pydantic models in our Python Library. Therefore dumping the informal representation or __str__ will give you the Pydantic string representation. Also that means you have can take advance of all the Pydantic features like model_dumps, etc.

fn = api.get_function_by_id(fid="/example/src/07_array.cc:main#1")
print(fn)

fid='/example/src/07_array.cc:main#1' name='main' filepath='/example/src/07_array.cc' class_name=None variable_manager=VariableManagerModel(function_args=[], local_variables=['i', 'arr2', 'arr1', 'arr', '68952813'], tmp_variables=['$T1', '$T2', '$T3', '$T4', '$T22'], return_variables=['$T22'], variables={'i': VariableModel(vid='/example/src/07_array.cc:main:i', name='i', vartype=<VarType.LOCAL_VARIABLE: 'LOCAL_VARIABLE'>, unique_ssa_variables={'i_19': SSAVariableModel(ssa_name='i_19', ssa_version=19, variable_name='i', variable_defined_at_steps=[11], variable_used_at_steps=[], variable_used_in_callsites=[], record_attributes_defined_at_steps={}, record_attributes_used_at_steps={}, used_inside_other_tokenized_operand_tokens_at_step={}, tokenized_operands_defs_at_steps={11: [TokenizedOperandModel(operand_type=<TokenType.IS_UNDEF: 'IS_UNDEF'>, ssa_name='i_19', ssa_version=19, variable_name='i', step_index=11, position=1, used_inside_other_tokenized_operand_tokens_at_step={}, current_depth_position=0, tokens=[TokenModel(token_type=<TokenType.IS_VARIABLE: 'IS_VARIABLE'>, is_base_variable=True, code_name='ssa_name', value='i_19', value_extended='i', discovery_depth=0)])]}, tokenized_operands_uses_at_steps={}), 'i_5': SSAVariableModel(ssa_name='i_5', ssa_version=5, variable_name='i', variable_defined_at_steps=[], variable_used_at_steps=[12, 13, 17, 18], variable_used_in_callsites=[], ...[redacted]...