API Reference

Function

class eptalights_code.models.sophia_ir.function.FunctionModel(*, fid: str, name: str, filepath: str, class_name: str | None = None, class_props: Dict[str, TokenizedOperandModel] = {}, variable_manager: VariableManagerModel | None = VariableManagerModel(function_args=[], local_variables=[], tmp_variables=[], return_variables=[], variables={}), callsite_manager: CallsiteManagerModel | None = CallsiteManagerModel(step_callsites={}, unique_callsites={}, callsites={}), cfg: ControlFlowGraphModel | None = ControlFlowGraphModel(basicblock_exit_nodes=[], basicblock_steps={}, basicblock_edges={}), steps: List[SophiaIRNopModel | SophiaIRLabelModel | SophiaIRAssignModel | SophiaIRCallModel | SophiaIRCondModel | SophiaIRReturnModel | SophiaIRGotoModel | SophiaIRSwitchModel] = [])

Represents a function within a program analysis context.

Attributes

fidstr

A unique identifier for the function.

namestr

The name of the function.

filepathstr

The file path where the function is defined.

class_nameOptional[str], optional

The name of the class containing this function (if applicable). Defaults to None.

class_propsDict[str, TokenizedOperandModel], optional

A dictionary mapping class property names to their tokenized representations. Defaults to an empty dictionary.

variable_managerOptional[VariableManagerModel], optional

An instance managing the variables within the function scope. Defaults to an empty VariableManagerModel.

callsite_managerOptional[CallsiteManagerModel], optional

An instance managing the function’s call sites. Defaults to an empty CallsiteManagerModel.

cfgOptional[ControlFlowGraphModel], optional

The control flow graph representing the function’s execution flow. Defaults to an empty ControlFlowGraphModel.

stepsList[Union[SophiaIRNopModel, SophiaIRLabelModel, SophiaIRAssignModel, SophiaIRCallModel, SophiaIRCondModel, SophiaIRReturnModel, SophiaIRGotoModel, SophiaIRSwitchModel]]

A list of steps representing the function’s operations in the intermediate representation (IR). Each step corresponds to an operation such as assignment, call, condition, return, etc. Defaults to an empty list.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the function.

classmethod set_steps(v)

Validate and convert steps into their respective IR models.

Parameters

vlist

A list of step dictionaries containing operation details.

Returns

list

A list of instantiated IR step models corresponding to the operation types.

Raises

Exception

If an unknown operation type is encountered.
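The conversion that set_steps describes can be pictured as a dispatch on each step's operation type. The following is a minimal sketch under stated assumptions, not the library's implementation: the stand-in classes NopStep and ReturnStep, the "op" dictionary key, the lowercase operation names, and the helper convert_steps are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


# Hypothetical stand-ins for two of the documented IR step models.
@dataclass
class NopStep:
    step_index: int


@dataclass
class ReturnStep:
    step_index: int


# Assumed mapping from an operation name to the class that models it.
STEP_MODELS = {"nop": NopStep, "return": ReturnStep}


def convert_steps(raw_steps: List[Dict[str, Any]]) -> list:
    """Turn step dictionaries into step objects, rejecting unknown ops."""
    converted = []
    for raw in raw_steps:
        model = STEP_MODELS.get(raw.get("op"))
        if model is None:
            # Mirrors the documented behaviour: unknown operation types raise.
            raise Exception(f"unknown operation type: {raw.get('op')!r}")
        converted.append(model(step_index=raw["step_index"]))
    return converted


steps = convert_steps([
    {"op": "nop", "step_index": 0},
    {"op": "return", "step_index": 1},
])
```

A table-driven dispatch keeps the list of supported operations in one place, so adding a step type means adding one entry rather than another branch.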

Step or Instruction

class eptalights_code.models.sophia_ir.function.SophiaIRNopModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.NOP)

Represents a NOP (No Operation) instruction in the SOPHIA IR model.

Attributes

opOpType, optional

The operation type associated with this instruction. Defaults to OpType.NOP.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

update_variables_defined_and_used_here()

Update the variables defined and used in this NOP instruction.

This method is currently a placeholder and does not modify any state. It is intended to be overridden or extended in future implementations to handle variable tracking specific to NOP operations.

class eptalights_code.models.sophia_ir.function.SophiaIRAssignModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.ASSIGN, src: ExprModel, dst: TokenizedOperandModel)

Represents an assignment operation in the SOPHIA IR.

Attributes

opOpType, optional

The operation type, which defaults to OpType.ASSIGN.

srcExprModel

The source expression being assigned.

dstTokenizedOperandModel

The destination operand that receives the assignment.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

property defined_tokenized_operands: List[TokenizedOperandModel]

Retrieve the list of operands that are defined in this assignment.

Returns

List[TokenizedOperandModel]

A list containing the destination operand.

update_operand_step_index() None

Update the step index for all operands involved in this assignment.

This ensures that the step index of the destination and source operands aligns with the step index of the assignment operation.

update_variables_defined_and_used_here() None

Update the lists of variables defined and used in this assignment.

This method categorizes tokens into ssa_variables_defined_here and ssa_variables_used_here based on whether they are defined or used in this assignment operation.

property used_tokenized_operands: List[TokenizedOperandModel]

Retrieve the list of operands used in this assignment.

Returns

List[TokenizedOperandModel]

A list of operands used in the right-hand side of the assignment.

class eptalights_code.models.sophia_ir.function.SophiaIRCallModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.CALL, fname: str, fname_tokenized: TokenizedOperandModel | None = None, finfo: str | None = None, fargs: List[TokenizedOperandModel] = [], dst: TokenizedOperandModel | None = None)

Represents a function call operation in the SOPHIA IR.

Attributes

opOpType

The operation type, which is always set to OpType.CALL.

fnamestr

The name of the function being called.

fname_tokenizedOptional[TokenizedOperandModel], optional

A tokenized representation of the function name, if available. Defaults to None.

finfoOptional[str], optional

Additional function information, if available. Defaults to None.

fargsList[TokenizedOperandModel], optional

A list of tokenized operands representing the function arguments. Defaults to an empty list.

dstOptional[TokenizedOperandModel], optional

The destination operand where the function’s return value is stored. Defaults to None.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

property defined_tokenized_operands: List[TokenizedOperandModel]

Return a list of tokenized operands defined by this call operation.

Returns

List[TokenizedOperandModel]

A list containing the destination operand, if defined.

update_operand_step_index() None

Update the step index of all function arguments to match the current step index.

update_variables_defined_and_used_here() None

Update the lists of defined and used SSA variables based on the function call.

This method populates ssa_variables_defined_here and variables_defined_here for newly defined variables, and ssa_variables_used_here and variables_used_here for referenced variables.

property used_tokenized_operands: List[TokenizedOperandModel]

Return a list of tokenized operands used as arguments in the call.

Returns

List[TokenizedOperandModel]

A list of operands used in the function call, filtering only those that represent variables.

class eptalights_code.models.sophia_ir.function.SophiaIRCondModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.COND, src: ExprModel, true_dst_block_index: int | None = None, false_dst_block_index: int | None = None)

Represents a conditional expression in the SOPHIA IR.

Attributes

opOpType

The operation type, set to OpType.COND by default.

srcExprModel

The conditional expression being evaluated.

true_dst_block_indexOptional[int], optional

The index of the block to jump to if the condition is true. Defaults to None.

false_dst_block_indexOptional[int], optional

The index of the block to jump to if the condition is false. Defaults to None.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

property defined_tokenized_operands: list

Return an empty list since conditional expressions do not define new operands.

Returns

list

An empty list.

update_operand_step_index() None

Update the step index for operands in the conditional expression.

This method sets the step_index of the left-hand side (lhs) operand and, if present, the right-hand side (rhs) operand.

update_variables_defined_and_used_here() None

Update the lists of SSA variables and regular variables used in this condition.

This method ensures that variables appearing in the conditional expression are tracked in ssa_variables_used_here and variables_used_here, avoiding duplicates.

property used_tokenized_operands: list

Return a list of tokenized operands used in the conditional expression.

Returns

list

A list of variable operands used in the condition.

class eptalights_code.models.sophia_ir.function.SophiaIRReturnModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.RETURN, dst: TokenizedOperandModel | None = None)

Represents a RETURN operation in the SOPHIA IR.

Attributes

opOpType

The operation type, set to OpType.RETURN by default.

dstOptional[TokenizedOperandModel], optional

The destination operand representing the return value. Defaults to None.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

property defined_tokenized_operands

Retrieve the operands defined by this RETURN operation.

Returns

list

An empty list, as RETURN does not define any new variables.

update_operand_step_index()

Update the step index for the return operand.

update_variables_defined_and_used_here()

Update the lists of SSA and regular variables used at this RETURN operation.

property used_tokenized_operands

Retrieve the operands used by this RETURN operation.

Returns

list of TokenizedOperandModel

A list containing the return operand if it is a variable.

class eptalights_code.models.sophia_ir.function.SophiaIRGotoModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.GOTO, dst: TokenizedOperandModel | None = None, goto_label_names: List[str] = [], goto_basic_blocks: List[int] = [])

Represents a SOPHIA IR ‘goto’ statement.

Attributes

opOpType

The operation type, set to OpType.GOTO by default.

dstTokenizedOperandModel, optional

The destination operand for the goto statement. Defaults to None.

goto_label_namesList[str], optional

A list of possible label names the goto statement may jump to. Defaults to an empty list.

goto_basic_blocksList[int], optional

A list of possible basic block indices the goto statement may target. Defaults to an empty list.

Notes

  • If the goto statement targets a variable (e.g., goto varname), it may resolve to multiple labels, similar to a switch statement.

  • If it targets a constant (e.g., goto constant), it resolves to a single static label.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

property defined_tokenized_operands: list

Return an empty list, as ‘goto’ does not define any operands.

Returns

list

An empty list.

update_variables_defined_and_used_here() None

Update variables defined and used at this ‘goto’ statement.

Notes

Since ‘goto’ does not define or use any variables, this method is a no-op.

property used_tokenized_operands: list

Return an empty list, as ‘goto’ does not use any operands.

Returns

list

An empty list.

class eptalights_code.models.sophia_ir.function.SophiaIRSwitchModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.SWITCH, switch_index: TokenizedOperandModel, switch_cases: List[TokenizedOperandModel] | None = [], switch_label_names: List[str] | None = [], switch_basic_blocks: List[int] | None = [])

Represents a SWITCH operation in the SOPHIA IR model.

Attributes

opOpType

The operation type, set to OpType.SWITCH.

switch_indexTokenizedOperandModel

The operand representing the index or variable used in the switch condition.

switch_casesOptional[List[TokenizedOperandModel]], optional

A list of operand models representing the case values in the switch statement. Defaults to an empty list.

switch_label_namesOptional[List[str]], optional

A list of label names associated with each case. Defaults to an empty list.

switch_basic_blocksOptional[List[int]], optional

A list of basic block indices corresponding to each case. Defaults to an empty list.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

property defined_tokenized_operands: List[TokenizedOperandModel]

Retrieve a list of tokenized operands defined at this switch statement.

Returns

List[TokenizedOperandModel]

An empty list since a switch statement does not define new operands.

update_operand_step_index() None

Update the step_index attribute for all operands in this switch statement.

update_variables_defined_and_used_here() None

Update the lists of SSA and regular variables used in this switch statement.

property used_tokenized_operands: List[TokenizedOperandModel]

Retrieve a list of tokenized operands used in this switch statement.

Returns

List[TokenizedOperandModel]

A list of operands used in the switch index and case values, if they are variables.
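Because switch_cases, switch_label_names, and switch_basic_blocks are each described as per-case lists, one way to read them is as parallel lists. The sketch below pairs them by position using plain Python values. It assumes the three lists are index-aligned, which this reference does not confirm; the sample values and the helper case_targets are hypothetical, and plain strings stand in for the TokenizedOperandModel case values.

```python
from typing import Dict, List, Tuple

# Hypothetical sample data shaped like the documented list attributes.
# Real case values are TokenizedOperandModel instances; strings stand in here.
switch_cases: List[str] = ["0", "1", "default"]
switch_label_names: List[str] = ["<L0>", "<L1>", "<L2>"]
switch_basic_blocks: List[int] = [3, 4, 5]


def case_targets(
    cases: List[str], labels: List[str], blocks: List[int]
) -> Dict[str, Tuple[str, int]]:
    """Map each case value to its (label name, basic block index) pair."""
    return {case: (label, block) for case, label, block in zip(cases, labels, blocks)}


targets = case_targets(switch_cases, switch_label_names, switch_basic_blocks)
```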

class eptalights_code.models.sophia_ir.function.ExprModel(*, expr_type: ExprType = ExprType.UNDEF, lhs: TokenizedOperandModel, rhs: TokenizedOperandModel | None = None)

Represents an expression in a program analysis context.

Attributes

expr_typeExprType, optional

The type of the expression. Defaults to ExprType.UNDEF.

lhsTokenizedOperandModel

The left-hand side operand of the expression.

rhsTokenizedOperandModel, optional

The right-hand side operand of the expression, if applicable. Defaults to None for unary expressions.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

Callsite

class eptalights_code.models.sophia_ir.callsite.CallsiteModel(*, cid: str | None = None, step_index: int, fn_name: List[str], num_of_args: int = 0, variables_used_as_callsite_arg: List[str] = [], variables_defined_here: List[str] = [], ssa_variables_used_as_callsite_arg: List[str] = [], ssa_variables_defined_here: List[str] = [])

Represents a function call site.

Attributes

cidOptional[str], optional

The unique identifier for the call site. Defaults to None.

step_indexint

The index of the step in which this call occurs.

fn_nameList[str]

The name of the function being called, stored as a list of strings.

num_of_argsint, optional

The number of arguments passed to the function call. Defaults to 0.

variables_used_as_callsite_argList[str], optional

A list of variable names that are used as arguments at the call site. Defaults to an empty list.

variables_defined_hereList[str], optional

A list of variable names that are defined at the call site. Defaults to an empty list.

ssa_variables_used_as_callsite_argList[str], optional

A list of SSA (Static Single Assignment) variables used as arguments. Defaults to an empty list.

ssa_variables_defined_hereList[str], optional

A list of SSA variables defined at the call site. Defaults to an empty list.

property name: str

Return the function name as a concatenated string.

Returns

str

The function name as a single string.

class eptalights_code.models.sophia_ir.callsite.CallsiteManagerModel(*, step_callsites: Dict[int, str] = {}, unique_callsites: Dict[str, List[str]] = {}, callsites: Dict[str, CallsiteModel] = {})

Manages function call sites.

Attributes

step_callsitesdict[int, str], optional

A mapping from step indices to call site identifiers (strings). Defaults to an empty dictionary.

unique_callsitesdict[str, List[str]], optional

A mapping from function names to lists of their SSA function names. Defaults to an empty dictionary.

callsitesdict[str, CallsiteModel], optional

A mapping from SSA variable names to their corresponding CallsiteModel instances. Defaults to an empty dictionary.

all(name: str | None = None) List[CallsiteModel]

Retrieve all call sites or those matching a specific function name.

Parameters

namestr, optional

The function name to filter by. If None, all call sites are returned.

Returns

list[CallsiteModel]

A list of matching CallsiteModel instances.

at_step(step_index: int) CallsiteModel | None

Retrieve a call site at a given step index.

Parameters

step_indexint

The index of the step.

Returns

CallsiteModel or None

The CallsiteModel instance if found, otherwise None.

by_ssa_name(ssa_name: str) CallsiteModel | None

Retrieve a call site model by its SSA function name.

Parameters

ssa_namestr

The SSA variable name of the call site.

Returns

CallsiteModel or None

The corresponding CallsiteModel instance if found, otherwise None.

property names: List[str]

Return a list of unique function names found in call sites.

Returns

list[str]

A list of function names.

search(name: str | None = None) List[CallsiteModel]

Search for call sites whose function names contain a given substring.

Parameters

namestr, optional

The substring to search for in function names.

Returns

list[CallsiteModel]

A list of matching CallsiteModel instances.

property ssa_names: List[str]

Return a list of all SSA variable names used in function calls.

Returns

list[str]

A list of SSA variable names.

property steps: List[int]

Return a list of step indices where function calls occur.

Returns

list[int]

A list of step indices.
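The three mappings held by CallsiteManagerModel suggest how its lookups could fit together. The following is a rough sketch with plain dictionaries, not the library's code. It assumes that the string values in step_callsites and unique_callsites are keys into callsites; the key names such as "malloc_1", the sample data, and the function all_callsites are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical data shaped like the three documented mappings.
callsites: Dict[str, dict] = {
    "malloc_1": {"fn_name": ["malloc"], "step_index": 2},
    "malloc_2": {"fn_name": ["malloc"], "step_index": 7},
    "free_1": {"fn_name": ["free"], "step_index": 9},
}
unique_callsites: Dict[str, List[str]] = {
    "malloc": ["malloc_1", "malloc_2"],
    "free": ["free_1"],
}
step_callsites: Dict[int, str] = {2: "malloc_1", 7: "malloc_2", 9: "free_1"}


def all_callsites(name: Optional[str] = None) -> List[dict]:
    """Return every call site, or only those for an exact function name."""
    if name is None:
        return list(callsites.values())
    return [callsites[key] for key in unique_callsites.get(name, [])]


def search(substring: str) -> List[dict]:
    """Return call sites whose function name contains the substring."""
    return [
        callsites[key]
        for fn, keys in unique_callsites.items()
        if substring in fn
        for key in keys
    ]


def at_step(step_index: int) -> Optional[dict]:
    """Return the call site recorded at a step, or None if there is none."""
    key = step_callsites.get(step_index)
    return callsites.get(key) if key is not None else None
```

In this reading, all() matches a function name exactly while search() matches any name containing the substring.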

CFG

class eptalights_code.models.sophia_ir.cfg.ControlFlowGraphModel(*, basicblock_exit_nodes: List[int] = [], basicblock_steps: Dict[int, List[int]] = {}, basicblock_edges: Dict[int, List[int]] = {})

Represents a control flow graph (CFG) model.

Attributes

basicblock_exit_nodesList[int], optional

A list of basic block exit nodes. Defaults to an empty list.

basicblock_stepsDict[int, List[int]], optional

A mapping of basic block indices to their corresponding step indices. Defaults to an empty dictionary.

basicblock_edgesDict[int, List[int]], optional

A mapping of basic block indices to their successor basic blocks. Defaults to an empty dictionary.
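The shapes of these attributes are enough to walk the graph. As a minimal sketch, the breadth-first traversal below collects the basic blocks reachable from a starting block and the step indices they contain. The sample graph, the choice of block 0 as the entry block, and the helper reachable_blocks are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical diamond-shaped CFG using the documented attribute shapes.
basicblock_edges: Dict[int, List[int]] = {0: [1, 2], 1: [3], 2: [3], 3: []}
basicblock_steps: Dict[int, List[int]] = {0: [0, 1], 1: [2], 2: [3, 4], 3: [5]}
basicblock_exit_nodes: List[int] = [3]


def reachable_blocks(entry: int, edges: Dict[int, List[int]]) -> Set[int]:
    """Breadth-first search over successor edges, starting at entry."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        block = queue.popleft()
        for succ in edges.get(block, []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


blocks = reachable_blocks(0, basicblock_edges)
# Step indices covered by the reachable blocks, in ascending order.
steps_reached = sorted(s for b in blocks for s in basicblock_steps.get(b, []))
```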

Config

class eptalights_code.models.sophia_ir.config.ConfigModel(*, project_id: str | None = None, extractor_output_path: str, code_type: str, storage_backend: str | None = 'sqlite3', local_database_path: str | None = None, output_decompiled_path: str | None = './__eptalights_decompiled_code/')

Represents the configuration settings for a code analysis process.

Attributes

project_idstr, optional

The unique identifier for the project. Defaults to None.

extractor_output_pathstr

The path to the extractor’s output directory.

code_typestr

The type of code being processed (e.g., source, intermediate representation).

storage_backendstr, optional

The storage backend used for extracted data. Defaults to 'sqlite3'; support for additional file-based databases is planned.

local_database_pathstr, optional

The path to the database storing extracted information. Defaults to None.

output_decompiled_pathstr, optional

The destination path for storing decompiled code. Defaults to “./__eptalights_decompiled_code/”.
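To make the required and optional fields easier to see, the sketch below mirrors the documented signature with a standard-library dataclass. ConfigSketch is a hypothetical stand-in, not ConfigModel itself, and the values given for extractor_output_path and code_type are placeholders, since this reference does not list the accepted values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigSketch:
    """Stand-in that copies the documented field names and defaults."""

    # Required fields (no default in the documented signature).
    extractor_output_path: str
    code_type: str
    # Optional fields with the documented defaults.
    project_id: Optional[str] = None
    storage_backend: Optional[str] = "sqlite3"
    local_database_path: Optional[str] = None
    output_decompiled_path: Optional[str] = "./__eptalights_decompiled_code/"


# Placeholder values; only the two required fields are supplied.
config = ConfigSketch(
    extractor_output_path="./extractor_output",
    code_type="example-code-type",
)
```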

Variable

class eptalights_code.models.sophia_ir.variable.SSAVariableModel(*, ssa_name: str | None = None, ssa_version: int = 0, variable_name: str | None = None, variable_defined_at_steps: list[int] = [], variable_used_at_steps: list[int] = [], variable_used_in_callsites: list[str] = [], record_attributes_defined_at_steps: Dict[int, List[str]] = {}, record_attributes_used_at_steps: Dict[int, List[str]] = {}, used_inside_other_tokenized_operand_tokens_at_step: Dict[int, List[str]] | None = {}, tokenized_operands_defs_at_steps: Dict[int, List[TokenizedOperandModel]] = {}, tokenized_operands_uses_at_steps: Dict[int, List[TokenizedOperandModel]] = {})

Represents an SSA (Static Single Assignment) variable in program analysis.

Attributes

ssa_namestr, optional

The SSA name of the variable, if available. Defaults to None.

ssa_versionint, optional

The SSA version of the variable. Defaults to 0.

variable_namestr, optional

The original variable name before SSA transformation. Defaults to None.

variable_defined_at_stepslist of int, optional

A list of step indices where the variable is defined. Defaults to an empty list.

variable_used_at_stepslist of int, optional

A list of step indices where the variable is used. Defaults to an empty list.

variable_used_in_callsiteslist of str, optional

A list of function call sites where this variable is used. Defaults to an empty list.

record_attributes_defined_at_stepsdict of {int: list of str}, optional

A mapping of step indices to lists of record attributes defined at each step. Defaults to an empty dictionary.

record_attributes_used_at_stepsdict of {int: list of str}, optional

A mapping of step indices to lists of record attributes used at each step. Defaults to an empty dictionary.

used_inside_other_tokenized_operand_tokens_at_step : dict of {int: list of str}, optional

A mapping of step indices to lists of tokenized operand tokens in which this variable is used. Defaults to an empty dictionary.

tokenized_operands_defs_at_steps : dict of {int: list of TokenizedOperandModel}, optional

A mapping of step indices to lists of tokenized operand definitions associated with this variable. Defaults to an empty dictionary.

tokenized_operands_uses_at_steps : dict of {int: list of TokenizedOperandModel}, optional

A mapping of step indices to lists of tokenized operand uses associated with this variable. Defaults to an empty dictionary.

class eptalights_code.models.sophia_ir.variable.VariableModel(*, vid: str | None = None, name: str, vartype: VarType = VarType.UNDEF, unique_ssa_variables: Dict[str, SSAVariableModel] = {}, full_declaration: str | None = None, type_declaration: str | None = None, type_props: List[str] | None = [], tokenized_type_declaration: TokenizedOperandModel | None = None, additional_info: Dict[str, Any] = {}, phi_ssa_variables: Dict[str, List[str]] = {})

Represents a variable in program analysis, including SSA details.

Attributes

vidstr, optional

A unique identifier for the variable in the format filename:function_name:variable_name. Defaults to None.

namestr

The name of the variable.

vartypeVarType, optional

The type of the variable, defaulting to VarType.UNDEF.

unique_ssa_variablesdict[str, SSAVariableModel], optional

A mapping of SSA variable names to their corresponding SSAVariableModel instances. Defaults to an empty dictionary.

full_declarationstr, optional

The full declaration of the variable, if available.

type_declarationstr, optional

The type declaration of the variable, if available.

type_propslist[str], optional

A list of additional type properties. Defaults to an empty list.

tokenized_type_declarationTokenizedOperandModel, optional

A tokenized representation of the variable’s type declaration.

additional_infodict[str, Any], optional

Additional metadata about the variable.

phi_ssa_variablesdict[str, list[str]], optional

A mapping of SSA variable names to lists of SSA versions used in phi functions. Defaults to an empty dictionary.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the variable.

property defined_at_steps: List[int]

Get the unique execution steps where the variable is defined.

Returns

list[int]

A list of unique step indices where this variable is defined.

get_ssa_var(ssa_name: str) SSAVariableModel | None

Retrieve a specific SSA version of the variable.

Parameters

ssa_namestr

The name of the SSA variable to retrieve.

Returns

SSAVariableModel or None

The SSAVariableModel instance corresponding to ssa_name, or None if not found.

property ssa_variables: List[SSAVariableModel]

Get all SSA versions of the variable.

Returns

list[SSAVariableModel]

A list of SSAVariableModel instances representing SSA versions of this variable.

property used_at_steps: List[int]

Get the unique execution steps where the variable is used.

Returns

list[int]

A list of unique step indices where this variable is used.
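One plausible reading of defined_at_steps and used_at_steps is that they merge the per-SSA-version step lists stored in unique_ssa_variables. The sketch below shows that aggregation with plain dictionaries. The merging rule, the sorted output order, the sample data, and the helper merged_steps are assumptions, not confirmed behaviour.

```python
from typing import Dict, List

# Hypothetical per-SSA-version records for a variable named "buf".
unique_ssa_variables: Dict[str, dict] = {
    "buf_1": {"variable_defined_at_steps": [2], "variable_used_at_steps": [4, 6]},
    "buf_2": {"variable_defined_at_steps": [6], "variable_used_at_steps": [6, 8]},
}


def merged_steps(ssa_vars: Dict[str, dict], field: str) -> List[int]:
    """Collect the de-duplicated, sorted step indices stored under field."""
    return sorted({step for record in ssa_vars.values() for step in record[field]})


defined_at_steps = merged_steps(unique_ssa_variables, "variable_defined_at_steps")
used_at_steps = merged_steps(unique_ssa_variables, "variable_used_at_steps")
```

Step 6 appears in both SSA versions' use lists but only once in the merged result, which is what "unique" implies here.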

class eptalights_code.models.sophia_ir.variable.VariableManagerModel(*, function_args: List[str] = [], local_variables: List[str] = [], tmp_variables: List[str] = [], return_variables: List[str] = [], variables: Dict[str, VariableModel] = {})

Manages variables within a function’s scope, tracking their usage and definitions.

Attributes

function_argsList[str], optional

A list of function argument names. Defaults to an empty list.

local_variablesList[str], optional

A list of local variable names. Defaults to an empty list.

tmp_variablesList[str], optional

A list of temporary variable names. Defaults to an empty list.

return_variablesList[str], optional

A list of variables representing function return values. Defaults to an empty list.

variablesDict[str, VariableModel], optional

A dictionary mapping variable names to their corresponding VariableModel instances. Defaults to an empty dictionary.

all() List[VariableModel]

Retrieve all stored variables.

Returns

List[VariableModel]

A list of all VariableModel instances stored in variables.

defined_at_step(step_index: int) List[VariableModel]

Retrieve variables that are defined at a given step.

Parameters

step_indexint

The step index to check for variable definitions.

Returns

List[VariableModel]

A list of VariableModel instances that were defined at the given step.

get(name: str) VariableModel

Retrieve a variable by its name.

Parameters

namestr

The name of the variable to retrieve.

Returns

VariableModel or None

The corresponding VariableModel instance if found, otherwise None.

property names: List[str]

Get the list of all variable names.

Returns

List[str]

A list of all variable names stored in variables.

search(name: str | None = None) List[VariableModel]

Search for variables whose names contain a given substring.

Parameters

namestr, optional

The substring to search for in variable names.

Returns

list[VariableModel]

A list of matching VariableModel instances.

property ssa_names: List[str]

Get the list of all unique SSA (Static Single Assignment) variable names.

Returns

List[str]

A list of unique SSA variable names across all stored variables.

used_at_step(step_index: int) List[VariableModel]

Retrieve variables that are used at a given step.

Parameters

step_indexint

The step index to check for variable usage.

Returns

List[VariableModel]

A list of VariableModel instances that were used at the given step.

used_or_defined_at_step(step_index: int) List[VariableModel]

Retrieve variables that are either used or defined at a given step.

Parameters

step_indexint

The step index to check for variable usage or definition.

Returns

List[VariableModel]

A list of VariableModel instances used or defined at the given step.
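The three step-based queries differ only in which step lists they consult. As a small sketch, the functions below apply that filtering to plain dictionaries and return variable names rather than VariableModel instances. The sample data and the dictionary layout are hypothetical.

```python
from typing import Dict, List

# Hypothetical variables with the steps at which each is defined and used.
variables: Dict[str, dict] = {
    "n": {"defined_at_steps": [0], "used_at_steps": [3, 5]},
    "i": {"defined_at_steps": [1, 5], "used_at_steps": [3, 5]},
    "tmp": {"defined_at_steps": [3], "used_at_steps": [4]},
}


def defined_at_step(step_index: int) -> List[str]:
    """Names of variables defined at the given step."""
    return [n for n, v in variables.items() if step_index in v["defined_at_steps"]]


def used_at_step(step_index: int) -> List[str]:
    """Names of variables used at the given step."""
    return [n for n, v in variables.items() if step_index in v["used_at_steps"]]


def used_or_defined_at_step(step_index: int) -> List[str]:
    """Names of variables that are used or defined at the given step."""
    return [
        n
        for n, v in variables.items()
        if step_index in v["defined_at_steps"] or step_index in v["used_at_steps"]
    ]
```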

Tokenized Operands

class eptalights_code.models.sophia_ir.tokenized_operand.TokenizedOperandModel(*, operand_type: TokenType | None = TokenType.IS_UNDEF, ssa_name: str | None = None, ssa_version: Annotated[int, Strict(strict=True)] = 0, variable_name: str | None = None, step_index: Annotated[int, Strict(strict=True)] | None = None, position: Annotated[int, Strict(strict=True)] = 0, used_inside_other_tokenized_operand_tokens_at_step: Dict[Annotated[int, Strict(strict=True)], List[str]] = {}, current_depth_position: Annotated[int, Strict(strict=True)] = 0, tokens: List[TokenModel] = [])

Represents a tokenized operand used within a specific step of program analysis.

Attributes

operand_typeTokenType, optional

The type of operand. Defaults to TokenType.IS_UNDEF.

ssa_namestr, optional

The SSA name for the operand.

ssa_versionint

The version of the SSA name. Defaults to 0.

variable_namestr, optional

The variable name for the operand.

step_indexint, optional

The index of the step where the operand is used.

positionint

The position of the operand within a specific context. Defaults to 0.

used_inside_other_tokenized_operand_tokens_at_stepdict[int, list[str]], optional

A dictionary where keys represent step indices and values are lists of tokens used in other tokenized operands at those steps. Defaults to an empty dictionary.

current_depth_positionint

The current depth position of the operand. Defaults to 0.

tokenslist[TokenModel]

A list of tokens associated with the operand.

_debug_visited_nodeslist[str], optional

For debugging purposes, tracks visited nodes during analysis. Defaults to an empty list.

array_index_token_at_index(idx: int) TokenModel

Retrieves the array index token at a specified index.

Parameters

idxint

The index of the array token to retrieve.

Returns

TokenModel

The token at the specified index, or an empty TokenModel if not found.

array_index_token_values_iter() List[TokenModel]

Yields the values of tokens representing array indices.

Yields

str

The value of the token representing an array index.

array_index_tokens_iter() List[TokenModel]

Yields the tokens representing array indices.

Yields

TokenModel

The token representing an array index.

constant_index_tokens_iter() List[TokenModel]

Yields tokens that represent constants.

Yields

TokenModel

The token representing a constant.

constant_token_at_index(idx: int) TokenModel

Retrieves the constant token at a specified index.

Parameters

idxint

The index of the constant token to retrieve.

Returns

TokenModel

The token at the specified index, or an empty TokenModel if not found.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.

get_field_attributes_used_in_tokens() List[str]

Extracts and returns the attribute values used in the tokens.

Returns

list[str]

A list of attribute values from tokens with type TokenType.IS_ATTRIBUTE.

get_total_array_index_tokens() int

Returns the total number of array index tokens.

Returns

int

The total number of array index tokens.

has_constant_in_tokens() bool

Checks if any token in the operand is a constant.

Returns

bool

True if a constant token is found, otherwise False.

has_ssa_variable_extracted() bool

Checks whether the operand has an SSA variable extracted.

Returns

bool

True if both variable_name and ssa_name are not None, otherwise False.

model_post_init(context: Any, /) None

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:

self: The BaseModel instance.

context: The context.

pretty_print_tokens()

Prints a pretty representation of the tokens.

This function converts the tokens to a dictionary format and prints them for debugging purposes.

symbol_index_tokens_iter() -> List[TokenModel]

Yields tokens that represent symbols.

Yields

TokenModel

The token representing a symbol.

symbol_token_at_index(idx: int) -> TokenModel

Retrieves the symbol token at a specified index.

Parameters

idx : int

The index of the symbol token to retrieve.

Returns

TokenModel

The token at the specified index, or an empty TokenModel if not found.

class eptalights_code.models.sophia_ir.tokenized_operand.TokenModel(*, token_type: TokenType = TokenType.IS_UNDEF, is_base_variable: bool = False, code_name: str | None = None, value: str | None = None, value_extended: str | None = None, discovery_depth: int = 0)

Represents a token model with attributes holding metadata.

Attributes

model_config : dict

Configuration dictionary for model validation settings, such as enabling assignment validation.

token_type : TokenType

The type of the token, defaults to TokenType.IS_UNDEF.

is_base_variable : bool

Flag indicating whether the token is a base variable, defaults to False.

code_name : str, optional

An optional string representing the code name of the token.

value : str, optional

An optional string representing the token’s value.

value_extended : str, optional

An optional extended value of the token.

discovery_depth : int

The depth of discovery for the token, defaults to 0.

Enum Types

class eptalights_code.models.sophia_ir.enum_types.VarType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Represents the type of a variable in program analysis.

Attributes

FUNCTION_ARGUMENT

Represents a function argument.

LOCAL_VARIABLE

Represents a local variable.

TMP_VARIABLE

Represents a temporary variable.

GLOBAL_VARIABLE

Represents a global variable.

UNDEF

Represents an undefined variable type.

class eptalights_code.models.sophia_ir.enum_types.TokenType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

An enumeration representing the token type of an element.

Attributes

IS_UNDEF

Represents an undefined token type.

IS_VARIABLE

Represents a variable token type. Example: Array access used with a variable index, e.g., a[n] where a and n are variables.

IS_CONSTANT

Represents a constant token type. Example: Array access used with a constant index, e.g., a[10].

IS_SYMBOL

Represents a symbol token type. Example: Symbols like [, ], *, ., ->, &, etc.

IS_ATTRIBUTE

Represents an attribute token type. Example: Struct field access, e.g., st.data, st->data.

IS_TYPE

Represents a type token type. Example: Type declaration, e.g., struct FILE fp.

IS_FUNCTION

Represents a function token type. Example: Functions passed as arguments to calls, e.g., select_files and alphasort.

Example Usage:

int select_files(const struct dirent *dirbuf)
{
    if (dirbuf->d_name[0] == '.')
        return 0;
    else
        return 1;
}

int alphasort(const struct dirent **a, const struct dirent **b)
{
    return (strcmp((*a)->d_name, (*b)->d_name));
}

int scandir(
    const char *dir,
    struct dirent ***namelist,
    int (*select) (const struct dirent *),
    int (*compar) (const struct dirent **, const struct dirent **)
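);

/* Hypothetical call: scandir(".", &namelist, select_files, alphasort);
   here select_files and alphasort are the IS_FUNCTION tokens. */

IS_VARIABLE_AND_IS_FUNCTION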

Represents both a variable and a function token type.
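To make the token categories concrete, the sketch below classifies the C operand `st->data[10]` by hand. The `Token` tuple and the reduced `TokenType` enum are stand-ins defined here for illustration, and the exact order and granularity in which the real tokenizer emits tokens is an assumption.

```python
from enum import Enum, auto
from typing import List, NamedTuple


class TokenType(Enum):
    # Stand-in for the documented enum; member values are arbitrary here.
    IS_UNDEF = auto()
    IS_VARIABLE = auto()
    IS_CONSTANT = auto()
    IS_SYMBOL = auto()
    IS_ATTRIBUTE = auto()


class Token(NamedTuple):
    token_type: TokenType
    value: str


# Hand-written classification of the operand `st->data[10]`.
tokens: List[Token] = [
    Token(TokenType.IS_VARIABLE, "st"),
    Token(TokenType.IS_SYMBOL, "->"),
    Token(TokenType.IS_ATTRIBUTE, "data"),
    Token(TokenType.IS_SYMBOL, "["),
    Token(TokenType.IS_CONSTANT, "10"),
    Token(TokenType.IS_SYMBOL, "]"),
]

# Filters in the spirit of get_field_attributes_used_in_tokens() and
# has_constant_in_tokens(), written against the stand-in types.
attributes = [t.value for t in tokens if t.token_type is TokenType.IS_ATTRIBUTE]
has_constant = any(t.token_type is TokenType.IS_CONSTANT for t in tokens)
symbols = [t.value for t in tokens if t.token_type is TokenType.IS_SYMBOL]
```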

class eptalights_code.models.sophia_ir.enum_types.OpType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Enumeration of operation types.

Attributes

NOP

No operation.

ASSIGN

Assignment operation.

CALL

Function or method call.

RETURN

Return from a function or method.

COND

Conditional operation (e.g., if-else).

GOTO

Unconditional jump to another location.

SWITCH

Switch-case operation.

LABEL

Label for jump or branch operations.
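A common use of an operation-type enum is to bucket the steps of a function. The sketch below uses a local `OpType` stand-in (the string values are made up, since the reference does not list them) and a hand-written list of steps rather than real `FunctionModel.steps`.

```python
from collections import Counter
from enum import Enum


class OpType(Enum):
    # Stand-in enum; the member values are assumptions.
    NOP = "nop"
    ASSIGN = "assign"
    CALL = "call"
    RETURN = "return"
    COND = "cond"
    GOTO = "goto"
    SWITCH = "switch"
    LABEL = "label"


# Hypothetical (op, source text) pairs for a small function body.
steps = [
    (OpType.ASSIGN, "n = len + 1"),
    (OpType.COND, "if (n > 0)"),
    (OpType.CALL, "p = malloc(n)"),
    (OpType.RETURN, "return p"),
]

# Histogram of operation types and the indices of all call steps.
op_counts = Counter(op for op, _ in steps)
call_indices = [i for i, (op, _) in enumerate(steps) if op is OpType.CALL]
```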

class eptalights_code.models.sophia_ir.enum_types.ExprType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Enumeration of expression types.

Attributes

NO_EXPR

No expression.

MULT_EXPR

Multiplication expression (lhs * rhs).

PLUS_EXPR

Addition expression (lhs + rhs).

MINUS_EXPR

Subtraction expression (lhs - rhs).

RDIV_EXPR

Real (floating-point) division expression (lhs / rhs).

DIV_EXPR

Division expression (lhs / rhs).

MOD_EXPR

Modulo expression (lhs % rhs).

GREATER_THAN_OR_EQUAL_EXPR

Greater than or equal to expression (lhs >= rhs).

GREATER_THAN_EXPR

Greater than expression (lhs > rhs).

LESS_THAN_EXPR

Less than expression (lhs < rhs).

LESS_THAN_OR_EQUAL_EXPR

Less than or equal to expression (lhs <= rhs).

EQUAL_EXPR

Equal to expression (lhs == rhs).

NOT_EQUAL_EXPR

Not equal to expression (lhs != rhs).

BITWISE_AND_EXPR

Bitwise AND expression (lhs & rhs).

BITWISE_EXCLUSIVE_OR_EXPR

Bitwise exclusive OR expression (lhs ^ rhs).

BITWISE_INCLUSIVE_OR_EXPR

Bitwise inclusive OR expression (lhs | rhs).

BITWISE_NOT_EXPR

Bitwise NOT expression (~lhs).

TRUNC_DIV_EXPR

Truncated division expression (lhs / rhs).

TRUNC_MOD_EXPR

Truncated modulo expression (lhs % rhs).

LSHIFT_EXPR

Left shift expression (lhs << rhs).

RSHIFT_EXPR

Right shift expression (lhs >> rhs).

RROTATE_EXPR

Right rotate expression, e.g. for a 32-bit value (((x) >> (b)) | ((x) << (32 - (b)))).

NEGATE_EXPR

Negate expression (-lhs).

MIN_EXPR

Minimum expression (lhs < rhs ? lhs : rhs).

MAX_EXPR

Maximum expression (lhs > rhs ? lhs : rhs).

POINTER_PLUS_EXPR

Pointer plus expression (lhs + rhs). This node represents pointer arithmetic. The first operand is always a pointer/reference type. The second operand is always an unsigned integer type compatible with sizetype. This is the only binary arithmetic operator that can operate on pointer types.

FIX_TRUNC_EXPR

Fix trunc expression (conversion of floating-point value to an integer). These nodes represent conversion of a floating-point value to an integer. The single operand will have a floating-point type, while the complete expression will have an integral (or boolean) type. The operand is rounded towards zero.

dst[bool, int] = (floating-point type)rhs

REALPART_EXPR

Real part of a complex-valued operand.

IMAGPART_EXPR

Imaginary part of a complex-valued operand.

ABS_EXPR

Absolute value of the single operand.

ABSU_EXPR

These nodes represent the absolute value of the single operand in equivalent unsigned type such that ABSU_EXPR of TYPE_MIN is well defined.

SPACESHIP_EXPR

Three-way comparison (spaceship) expression (lhs <=> rhs).

UNDEF

Undefined expression.

decompile()

Generate a human-readable or high-level representation.

Returns

str

A string representation of the expression.
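Several of these expression types differ from Python's own operators in small ways. The sketch below maps a few member names to plain Python callables, assuming each member follows the usual C meaning of its name and that rotation is over 32 bits; the `EVAL` table and helper functions are illustrative and not part of the library.

```python
import operator

MASK32 = 0xFFFFFFFF


def trunc_div(lhs: int, rhs: int) -> int:
    """Integer division that rounds toward zero, as in C."""
    quotient = abs(lhs) // abs(rhs)
    return quotient if (lhs >= 0) == (rhs >= 0) else -quotient


def trunc_mod(lhs: int, rhs: int) -> int:
    """Remainder paired with trunc_div (result takes the sign of lhs)."""
    return lhs - rhs * trunc_div(lhs, rhs)


def rrotate32(x: int, b: int) -> int:
    """Rotate the low 32 bits of x right by b positions."""
    x &= MASK32
    b %= 32
    return ((x >> b) | (x << (32 - b))) & MASK32


# Keys are ExprType member names; values are assumed semantics.
EVAL = {
    "BITWISE_AND_EXPR": operator.and_,
    "BITWISE_EXCLUSIVE_OR_EXPR": operator.xor,
    "BITWISE_INCLUSIVE_OR_EXPR": operator.or_,
    "LSHIFT_EXPR": operator.lshift,
    "RSHIFT_EXPR": operator.rshift,
    "TRUNC_DIV_EXPR": trunc_div,
    "TRUNC_MOD_EXPR": trunc_mod,
    "MIN_EXPR": min,
    "MAX_EXPR": max,
    "RROTATE_EXPR": rrotate32,
}

and_result = EVAL["BITWISE_AND_EXPR"](0b1100, 0b1010)
xor_result = EVAL["BITWISE_EXCLUSIVE_OR_EXPR"](0b1100, 0b1010)
div_result = EVAL["TRUNC_DIV_EXPR"](-7, 2)
rot_result = EVAL["RROTATE_EXPR"](0x12345678, 8)
```

Note that Python's `//` floors, so `-7 // 2` is `-4`, while truncated division of the same operands gives `-3`.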

File Metadata

class eptalights_code.models.sophia_ir.file_metadata.ClassMetadataModel(*, class_props: Dict[str, TokenizedOperandModel] = {}, class_methods: Dict[str, str] = {})

Represents metadata for a class within a program analysis context.

Attributes

class_props : Dict[str, TokenizedOperandModel]

A dictionary mapping property names to their tokenized representations. Represents the attributes (fields) of the class.

class_methods : Dict[str, str]

A dictionary mapping method names to their unique identifiers.

class eptalights_code.models.sophia_ir.file_metadata.FileMetadataModel(*, filepath: str, classes: Dict[str, ClassMetadataModel] = {}, functions: Dict[str, str] = {})

Represents metadata for a file within a program analysis context.

Attributes

filepath : str

The path to the file being analyzed.

classes : Dict[str, ClassMetadataModel]

A dictionary mapping class names to their metadata models.

functions : Dict[str, str]

A dictionary mapping function names to their unique identifiers.

File Data

class eptalights_code.models.sophia_ir.file_data.ClassDataModel(*, class_props: Dict[str, TokenizedOperandModel] = {}, class_methods: Dict[str, FunctionModel] = {})

Represents a class within a program analysis context.

Attributes

class_props : Dict[str, TokenizedOperandModel]

A dictionary mapping property names to their tokenized representations. Represents the attributes (fields) of the class.

class_methods : Dict[str, FunctionModel]

A dictionary mapping method names to their corresponding function models. Represents the functions (methods) of the class.

class eptalights_code.models.sophia_ir.file_data.FileDataModel(*, filepath: str, classes: Dict[str, ClassDataModel] = {}, functions: Dict[str, FunctionModel] = {})

Represents a file within a program analysis context.

Attributes

filepath : str

The path to the file being analyzed.

classes : Dict[str, ClassDataModel]

A dictionary mapping class names to their corresponding class models.

functions : Dict[str, FunctionModel]

A dictionary mapping function names to their corresponding function models.

decompile() -> str

Generates a human-readable or high-level representation of the file.

Returns

str

A string representation of the decompiled file.
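Because a file holds both top-level functions and per-class methods, a caller that wants every function has to walk two dictionaries. The sketch below shows one way to do that with small dataclass stand-ins that mirror only the documented field names; `iter_all_functions` and the `Class.method` naming scheme are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Tuple


@dataclass
class Function:
    # Stand-in for FunctionModel (only two of its fields).
    fid: str
    name: str


@dataclass
class ClassData:
    # Stand-in for ClassDataModel.
    class_methods: Dict[str, Function] = field(default_factory=dict)


@dataclass
class FileData:
    # Stand-in for FileDataModel.
    filepath: str
    classes: Dict[str, ClassData] = field(default_factory=dict)
    functions: Dict[str, Function] = field(default_factory=dict)


def iter_all_functions(file_data: FileData) -> Iterator[Tuple[str, Function]]:
    """Yield (qualified name, function) for free functions, then methods."""
    for name, fn in file_data.functions.items():
        yield name, fn
    for class_name, cls in file_data.classes.items():
        for method_name, fn in cls.class_methods.items():
            yield f"{class_name}.{method_name}", fn


file_data = FileData(
    filepath="src/shape.cpp",
    functions={"main": Function("f1", "main")},
    classes={"Shape": ClassData({"area": Function("f2", "area")})},
)
qualified_names = [name for name, _ in iter_all_functions(file_data)]
```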

Dataflow

class eptalights_code.models.sophia_ir.dataflow.SinkResultType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Enumeration representing possible sink result types.

Attributes

CONTINUE : SinkResultType

Indicates that the process should continue.

OK : SinkResultType

Indicates a successful result.

STOP : SinkResultType

Indicates that the process should stop.
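One plausible reading of the three results is: CONTINUE keeps walking, OK records a hit at the sink, and STOP abandons the current path. The toy driver below encodes that reading; it is an assumption about the semantics, and `SinkResult`, `walk` and `malloc_sink` are illustrative names that do not exist in the library.

```python
from enum import Enum, auto
from typing import Callable, List, Optional


class SinkResult(Enum):
    # Stand-in for SinkResultType.
    CONTINUE = auto()
    OK = auto()
    STOP = auto()


def walk(
    events: List[str], sink: Callable[[str], SinkResult]
) -> Optional[List[str]]:
    """Return the events up to and including the sink hit, or None."""
    path: List[str] = []
    for event in events:
        path.append(event)
        result = sink(event)
        if result is SinkResult.OK:
            return path
        if result is SinkResult.STOP:
            return None
    return None


def malloc_sink(event: str) -> SinkResult:
    if event.startswith("call malloc"):
        return SinkResult.OK
    if event.startswith("call sanitize"):
        # Hypothetical sanitizer: give up on this path.
        return SinkResult.STOP
    return SinkResult.CONTINUE


hit = walk(["assign n", "call malloc(n)", "return"], malloc_sink)
blocked = walk(["assign n", "call sanitize(n)", "call malloc(n)"], malloc_sink)
```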

class eptalights_code.models.sophia_ir.dataflow.DataflowEventModel(*, op: OpType, lineno: int, variable_name: str, ssa_variable_name: str, ssa_version: int, data_direction: str | None = None, var_depth_pos: int, step_index: int, record_attributes_defined_here: List[str] | None = [], record_attributes_used_here: List[str] | None = [], used_inside_other_tokenized_operand_tokens_here: bool = False, current_record_attibute_tracked: str | None = None)

Represents a dataflow event in the dataflow analysis.

Attributes

op : OpType

The operation type associated with this dataflow event.

lineno : int

The line number where this dataflow event occurs.

variable_name : str

The name of the variable involved in this event.

ssa_variable_name : str

The SSA (Static Single Assignment) name of the variable.

ssa_version : int

The SSA version number of the variable.

data_direction : str, optional

The direction of data flow (e.g., “read” or “write”). Defaults to None.

var_depth_pos : int

Internal property for debugging variable position.

step_index : int

The step index of the program execution where this event occurs.

record_attributes_defined_here : List[str], optional

A list of record attributes that are defined at this event. Defaults to an empty list.

record_attributes_used_here : List[str], optional

A list of record attributes that are used at this event. Defaults to an empty list.

used_inside_other_tokenized_operand_tokens_here : bool, optional

Indicates whether this variable is used inside other tokenized operand tokens. Defaults to False.

current_record_attibute_tracked : str, optional

Current record attribute being tracked. Defaults to None.
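Events are typically filtered by direction and SSA version after an analysis. The sketch below uses an `Event` dataclass that mirrors only four of the documented fields and assumes the "read" and "write" strings mentioned above are the actual `data_direction` values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    # Stand-in for DataflowEventModel (subset of fields).
    lineno: int
    variable_name: str
    ssa_version: int
    data_direction: Optional[str] = None


# Hypothetical events for a variable `buf` that is written twice.
events = [
    Event(10, "buf", 1, "write"),
    Event(12, "buf", 1, "read"),
    Event(15, "buf", 2, "write"),
    Event(18, "buf", 2, "read"),
]

# Lines where the variable is (re)defined.
write_lines = [e.lineno for e in events if e.data_direction == "write"]

# Reads that observe the second definition only.
reads_of_v2 = [
    e.lineno
    for e in events
    if e.ssa_version == 2 and e.data_direction == "read"
]
```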

class eptalights_code.models.sophia_ir.dataflow.DataflowStateModel(*, current_event: DataflowEventModel, current_function: FunctionModel, current_step: SophiaIRNopModel | SophiaIRAssignModel | SophiaIRCallModel | SophiaIRCondModel | SophiaIRReturnModel | SophiaIRGotoModel | SophiaIRSwitchModel, previous_events: List[DataflowEventModel] = [])

Represents the state of data flow analysis at a specific point in execution.

Attributes

current_event : DataflowEventModel

The current data flow event being processed.

current_function : FunctionModel

The function in which the current event is occurring.

current_step : Union[function_model.SophiaIRNopModel, function_model.SophiaIRAssignModel, function_model.SophiaIRCallModel, function_model.SophiaIRCondModel, function_model.SophiaIRReturnModel, function_model.SophiaIRGotoModel, function_model.SophiaIRSwitchModel]

The current step in the execution, represented by one of the SOPHIA IR models.

previous_events : List[DataflowEventModel], optional

A list of previous data flow events leading up to the current state. Defaults to an empty list.

class eptalights_code.models.sophia_ir.dataflow.DataflowPathModel(*, events: List[DataflowEventModel] = [], passthru_callsites: List[str] = [], data_mutation_count: int = 0)

Represents a sequence of data flow events in a program analysis context.

Attributes

events : List[DataflowEventModel], optional

A list of data flow events that constitute this path. Defaults to an empty list.

passthru_callsites : List[str], optional

A list of function call sites that the data flow passes through. Defaults to an empty list.

data_mutation_count : int

The number of times data was mutated. Defaults to 0.

class eptalights_code.models.sophia_ir.dataflow.DataflowRequestModel(*, function: FunctionModel, source_variable_name: str, sink_callback_fn: Callable[[DataflowStateModel], SinkResultType], start_from_step_index: int | None = None, timeout_secs: int = 180, strict_record_attributes_tracking: bool = True)

Represents a request for data flow analysis.

Attributes

source_variable_name : str

The name of the variable that serves as a data flow source.

start_from_step_index : int, optional

The index of the step from which the data flow source starts. If not provided, it defaults to None.

sink_callback_fn : Callable[[DataflowStateModel], SinkResultType]

A function that processes a DataflowStateModel and returns a SinkResultType. This function represents the sink in the data flow analysis.

Example:

def reachability_to_malloc_sink(
    state: DataflowStateModel
) -> SinkResultType:
    if (
        state.current_event.op == models.OpType.CALL
        and state.current_step.fname == "malloc"
    ):
        return models.SinkResultType.OK
    return models.SinkResultType.CONTINUE

function : FunctionModel

The function model representing the target function for data flow analysis.

timeout_secs : int, optional

The maximum time (in seconds) allowed for the analysis. Defaults to 180 seconds (3 minutes). Maximum is 600 seconds (10 minutes).

strict_record_attributes_tracking : bool

If True, strictly tracks data within complex records or variables, following only specific attributes rather than the base variable.

Example:

# Given the following code:

a.attr = 10
x = a
print(a.something_else)

# If `strict_record_attributes_tracking` is False, the data flow path will
# include `print(a.something_else)`, since it tracks `a` as a whole.

# However, if `strict_record_attributes_tracking` is True, the data flow
# path will **not** include `print(a.something_else)`, as it focuses only
# on `a.attr`.

# Another case:

a.attr = 10
x = a
print(a.attr)

# Here, the `print(a.attr)` statement **will** be included in the data
# flow path, as it matches the tracked attribute.

# Using strict attribute tracking is encouraged for accuracy, unless
# attributes cause side effects or a different behavior is required.
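The difference between the two modes can be mimicked with a small filter. This is a sketch of the idea only, not the library's algorithm: `select_uses` is a hypothetical helper, each use is reduced to a `(text, base variable, attribute)` tuple, and treating whole-variable uses such as `x = a` as matching in strict mode is an assumption.

```python
from typing import List, Optional, Tuple

# (statement text, base variable, attribute accessed or None)
Use = Tuple[str, str, Optional[str]]


def select_uses(
    uses: List[Use], base: str, tracked_attr: str, strict: bool
) -> List[str]:
    """Keep uses of `base`; in strict mode drop uses of other attributes."""
    selected = []
    for text, var, attr in uses:
        if var != base:
            continue
        if strict and attr is not None and attr != tracked_attr:
            continue
        selected.append(text)
    return selected


uses: List[Use] = [
    ("x = a", "a", None),
    ("print(a.something_else)", "a", "something_else"),
    ("print(a.attr)", "a", "attr"),
]

strict_path = select_uses(uses, "a", "attr", strict=True)
loose_path = select_uses(uses, "a", "attr", strict=False)
```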

class eptalights_code.models.sophia_ir.dataflow.DataflowResponseModel(*, status: bool, paths: List[DataflowPathModel] = [], error_message: str | None = None)

Represents the response model for a dataflow analysis request.

Attributes

status : bool

Indicates whether the dataflow analysis was successful.

paths : List[DataflowPathModel], optional

A list of dataflow paths resulting from the analysis. Defaults to an empty list.

error_message : Optional[str], optional

An error message if the analysis failed. Defaults to None.
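A caller usually checks `status` first, then distinguishes "no paths" from "some paths". The sketch below does this with dataclass stand-ins that copy the documented field names and defaults; `summarize` and its message strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Path:
    # Stand-in for DataflowPathModel.
    events: List[str] = field(default_factory=list)
    passthru_callsites: List[str] = field(default_factory=list)
    data_mutation_count: int = 0


@dataclass
class Response:
    # Stand-in for DataflowResponseModel.
    status: bool
    paths: List[Path] = field(default_factory=list)
    error_message: Optional[str] = None


def summarize(response: Response) -> str:
    """Turn a response into a one-line summary."""
    if not response.status:
        return f"analysis failed: {response.error_message}"
    if not response.paths:
        return "no path reached the sink"
    best = min(response.paths, key=lambda p: p.data_mutation_count)
    return (
        f"{len(response.paths)} path(s); "
        f"fewest mutations: {best.data_mutation_count}"
    )


ok = Response(True, [Path(["e1", "e2"], [], 2), Path(["e1"], ["memcpy"], 0)])
failed = Response(False, error_message="timeout")
```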

class eptalights_code.models.sophia_ir.dataflow.DataflowActionModel(*, action_id: Annotated[UUID, UuidVersion(uuid_version=4)], request_hash: str, status: DataflowActionStatusType, request: DataflowRequestModel, response: DataflowResponseModel | None = None, data_created: str)

Represents the model for a dataflow action request and its response.

Attributes

action_id : UUID4

Unique identifier for the dataflow action.

request_hash : str

Hash value of the request payload, used for deduplication or caching.

request : DataflowRequestModel

The original request model containing parameters for the dataflow action.

response : Optional[DataflowResponseModel], optional

The response model containing the results of the dataflow action. Defaults to None if no response is available yet.

data_created : str

Timestamp indicating when the dataflow action was created.
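The reference does not say how `request_hash` is computed. One common approach to deduplication, shown here purely as an assumption, is to hash a canonical JSON form of the serializable request fields (the callback cannot be serialized, so it is left out of this sketch). The `request_hash` function and the payload keys chosen are illustrative.

```python
import hashlib
import json
import uuid


def request_hash(payload: dict) -> str:
    """SHA-256 over a key-sorted, whitespace-free JSON encoding."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same request with keys in a different order hashes identically.
first = request_hash(
    {"fid": "f1", "source_variable_name": "n", "timeout_secs": 180}
)
second = request_hash(
    {"timeout_secs": 180, "source_variable_name": "n", "fid": "f1"}
)
other = request_hash(
    {"fid": "f2", "source_variable_name": "n", "timeout_secs": 180}
)

# A version-4 UUID, matching the documented action_id type.
action_id = uuid.uuid4()
```

Sorting keys before hashing is what makes the digest independent of dictionary ordering, so two equivalent requests map to the same cache entry.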