API Reference
Function
- class eptalights_code.models.sophia_ir.function.FunctionModel(*, fid: str, name: str, filepath: str, class_name: str | None = None, class_props: Dict[str, TokenizedOperandModel] = {}, variable_manager: VariableManagerModel | None = VariableManagerModel(function_args=[], local_variables=[], tmp_variables=[], return_variables=[], variables={}), callsite_manager: CallsiteManagerModel | None = CallsiteManagerModel(step_callsites={}, unique_callsites={}, callsites={}), cfg: ControlFlowGraphModel | None = ControlFlowGraphModel(basicblock_exit_nodes=[], basicblock_steps={}, basicblock_edges={}), steps: List[SophiaIRNopModel | SophiaIRLabelModel | SophiaIRAssignModel | SophiaIRCallModel | SophiaIRCondModel | SophiaIRReturnModel | SophiaIRGotoModel | SophiaIRSwitchModel] = [])
Represents a function within a program analysis context.
Attributes
- fidstr
A unique identifier for the function.
- namestr
The name of the function.
- filepathstr
The file path where the function is defined.
- class_nameOptional[str], optional
The name of the class containing this function (if applicable). Defaults to None.
- class_propsDict[str, TokenizedOperandModel], optional
A dictionary mapping class property names to their tokenized representations. Defaults to an empty dictionary.
- variable_managerOptional[VariableManagerModel], optional
An instance managing the variables within the function scope. Defaults to an empty VariableManagerModel.
- callsite_managerOptional[CallsiteManagerModel], optional
An instance managing the function’s call sites. Defaults to an empty CallsiteManagerModel.
- cfgOptional[ControlFlowGraphModel], optional
The control flow graph representing the function’s execution flow. Defaults to an empty ControlFlowGraphModel.
- stepsList[Union[SophiaIRNopModel, SophiaIRLabelModel, SophiaIRAssignModel, SophiaIRCallModel, SophiaIRCondModel, SophiaIRReturnModel, SophiaIRGotoModel, SophiaIRSwitchModel]]
A list of steps representing the function’s operations in the intermediate representation (IR). Each step corresponds to an operation such as assignment, call, condition, return, etc. Defaults to an empty list.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the function.
- classmethod set_steps(v)
Validate and convert steps into their respective IR models.
Parameters
- vlist
A list of step dictionaries containing operation details.
Returns
- list
A list of instantiated IR step models corresponding to the operation types.
Raises
- Exception
If an unknown operation type is encountered.
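The conversion described for set_steps can be pictured as a dispatch on each step's operation type. The sketch below is a stand-in for illustration only, not the library's code: the "op" dictionary key, the lowercase operation strings, and the StepStub class are all assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class StepStub:
    """Hypothetical stand-in for a SophiaIR*Model instance."""
    kind: str
    payload: Dict[str, Any]


# Assumed operation names; the real OpType values may differ.
KNOWN_OPS = {"nop", "label", "assign", "call", "cond", "return", "goto", "switch"}


def convert_steps(raw_steps: List[Dict[str, Any]]) -> List[StepStub]:
    """Turn step dictionaries into step objects, rejecting unknown operations."""
    converted = []
    for raw in raw_steps:
        op = raw.get("op")
        if op not in KNOWN_OPS:
            # Mirrors the documented behaviour: unknown operation types raise.
            raise Exception(f"unknown operation type: {op!r}")
        converted.append(StepStub(kind=op, payload=raw))
    return converted


steps = convert_steps([{"op": "assign", "lineno": 3}, {"op": "return", "lineno": 4}])
print([s.kind for s in steps])
```

Passing a dictionary whose operation is not in the known set raises instead of silently skipping the step, which matches the Raises entry above.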
Step or Instruction
- class eptalights_code.models.sophia_ir.function.SophiaIRNopModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.NOP)
Represents a NOP (No Operation) instruction in the SOPHIA IR model.
Attributes
- opOpType, optional
The operation type associated with this instruction. Defaults to OpType.NOP.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- update_variables_defined_and_used_here()
Update the variables defined and used in this NOP instruction.
This method is currently a placeholder and does not modify any state. It is intended to be overridden or extended in future implementations to handle variable tracking specific to NOP operations.
- class eptalights_code.models.sophia_ir.function.SophiaIRAssignModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.ASSIGN, src: ExprModel, dst: TokenizedOperandModel)
Represents an assignment operation in the SOPHIA IR.
Attributes
- opOpType, optional
The operation type, which defaults to OpType.ASSIGN.
- srcExprModel
The source expression being assigned.
- dstTokenizedOperandModel
The destination operand that receives the assignment.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- property defined_tokenized_operands: List[TokenizedOperandModel]
Retrieve the list of operands that are defined in this assignment.
Returns
- List[TokenizedOperandModel]
A list containing the destination operand.
- update_operand_step_index() None
Update the step index for all operands involved in this assignment.
This ensures that the step index of the destination and source operands aligns with the step index of the assignment operation.
- update_variables_defined_and_used_here() None
Update the lists of variables defined and used in this assignment.
This method categorizes tokens into ssa_variables_defined_here and ssa_variables_used_here based on whether they are defined or used in this assignment operation.
- property used_tokenized_operands: List[TokenizedOperandModel]
Retrieve the list of operands used in this assignment.
Returns
- List[TokenizedOperandModel]
A list of operands used in the right-hand side of the assignment.
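The relationship between src, dst, and the two operand properties can be sketched with small stand-in classes. Everything here is hypothetical: Operand, Expr, and Assign only mimic the documented fields, and the SSA names such as x_2 follow an assumed naming scheme.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Operand:
    """Hypothetical stand-in for TokenizedOperandModel."""
    ssa_name: Optional[str] = None
    variable_name: Optional[str] = None


@dataclass
class Expr:
    """Hypothetical stand-in for ExprModel: lhs is required, rhs is optional."""
    lhs: Operand
    rhs: Optional[Operand] = None


@dataclass
class Assign:
    """Hypothetical stand-in for SophiaIRAssignModel."""
    src: Expr
    dst: Operand

    @property
    def defined_operands(self) -> List[Operand]:
        # The destination is the only operand an assignment defines.
        return [self.dst]

    @property
    def used_operands(self) -> List[Operand]:
        # Operands on the right-hand side; rhs is absent for unary expressions.
        return [op for op in (self.src.lhs, self.src.rhs) if op is not None]


# Models the statement x = a + b in SSA form.
step = Assign(
    src=Expr(lhs=Operand("a_1", "a"), rhs=Operand("b_1", "b")),
    dst=Operand("x_2", "x"),
)
defined = [op.ssa_name for op in step.defined_operands]
used = [op.ssa_name for op in step.used_operands]
print(defined, used)
```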
- class eptalights_code.models.sophia_ir.function.SophiaIRCallModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.CALL, fname: str, fname_tokenized: TokenizedOperandModel | None = None, finfo: str | None = None, fargs: List[TokenizedOperandModel] = [], dst: TokenizedOperandModel | None = None)
Represents a function call operation in the SOPHIA IR.
Attributes
- opOpType
The operation type, which is always set to OpType.CALL.
- fnamestr
The name of the function being called.
- fname_tokenizedOptional[TokenizedOperandModel], optional
A tokenized representation of the function name, if available. Defaults to None.
- finfoOptional[str], optional
Additional function information, if available. Defaults to None.
- fargsList[TokenizedOperandModel], optional
A list of tokenized operands representing the function arguments. Defaults to an empty list.
- dstOptional[TokenizedOperandModel], optional
The destination operand where the function’s return value is stored. Defaults to None.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- property defined_tokenized_operands: List[TokenizedOperandModel]
Return a list of tokenized operands defined by this call operation.
Returns
- List[TokenizedOperandModel]
A list containing the destination operand, if defined.
- update_operand_step_index() None
Update the step index of all function arguments to match the current step index.
- update_variables_defined_and_used_here() None
Update the lists of defined and used SSA variables based on the function call.
This method populates ssa_variables_defined_here and variables_defined_here for newly defined variables, and ssa_variables_used_here and variables_used_here for referenced variables.
- property used_tokenized_operands: List[TokenizedOperandModel]
Return a list of tokenized operands used as arguments in the call.
Returns
- List[TokenizedOperandModel]
A list of operands used in the function call, filtering only those that represent variables.
- class eptalights_code.models.sophia_ir.function.SophiaIRCondModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.COND, src: ExprModel, true_dst_block_index: int | None = None, false_dst_block_index: int | None = None)
Represents a conditional expression in the SOPHIA IR.
Attributes
- opOpType
The operation type, set to OpType.COND by default.
- srcExprModel
The conditional expression being evaluated.
- true_dst_block_indexOptional[int], optional
The index of the block to jump to if the condition is true. Defaults to None.
- false_dst_block_indexOptional[int], optional
The index of the block to jump to if the condition is false. Defaults to None.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- property defined_tokenized_operands: list
Return an empty list since conditional expressions do not define new operands.
Returns
- list
An empty list.
- update_operand_step_index() None
Update the step index for operands in the conditional expression.
This method sets the step_index of the left-hand side (lhs) operand and, if present, the right-hand side (rhs) operand.
- update_variables_defined_and_used_here() None
Update the lists of SSA variables and regular variables used in this condition.
This method ensures that variables appearing in the conditional expression are tracked in ssa_variables_used_here and variables_used_here, avoiding duplicates.
- class eptalights_code.models.sophia_ir.function.SophiaIRReturnModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.RETURN, dst: TokenizedOperandModel | None = None)
Represents a RETURN operation in the SOPHIA IR.
Attributes
- opOpType
The operation type, set to OpType.RETURN by default.
- dstOptional[TokenizedOperandModel], optional
The destination operand representing the return value. Defaults to None.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- property defined_tokenized_operands
Retrieve the operands defined by this RETURN operation.
Returns
- list
An empty list, as RETURN does not define any new variables.
- update_operand_step_index()
Update the step index for the return operand.
- update_variables_defined_and_used_here()
Update the lists of SSA and regular variables used at this RETURN operation.
- class eptalights_code.models.sophia_ir.function.SophiaIRGotoModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.GOTO, dst: TokenizedOperandModel | None = None, goto_label_names: List[str] = [], goto_basic_blocks: List[int] = [])
Represents a SOPHIA IR ‘goto’ statement.
Attributes
- opOpType
The operation type, set to OpType.GOTO by default.
- dstTokenizedOperandModel, optional
The destination operand for the goto statement. Defaults to None.
- goto_label_namesList[str], optional
A list of possible label names the goto statement may jump to. Defaults to an empty list.
- goto_basic_blocksList[int], optional
A list of possible basic block indices the goto statement may target. Defaults to an empty list.
Notes
If the goto statement targets a variable (e.g., goto varname), it may resolve to multiple labels, similar to a switch statement.
If it targets a constant (e.g., goto constant), it resolves to a single static label.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- property defined_tokenized_operands: list
Return an empty list, as ‘goto’ does not define any operands.
Returns
- list
An empty list.
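The distinction drawn in the Notes for SophiaIRGotoModel, one static target versus several candidate targets, can be checked from the two target lists. This is a sketch under assumptions: GotoStub is a hypothetical stand-in, and treating "more than one candidate block" as a computed goto is a heuristic chosen for illustration rather than documented behaviour.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GotoStub:
    """Hypothetical stand-in carrying the two target lists of SophiaIRGotoModel."""
    goto_label_names: List[str] = field(default_factory=list)
    goto_basic_blocks: List[int] = field(default_factory=list)


def is_computed_goto(step: GotoStub) -> bool:
    """Treat a goto with more than one candidate block as a computed goto."""
    return len(step.goto_basic_blocks) > 1


# A goto to a constant label resolves to a single block.
direct = GotoStub(goto_label_names=["done"], goto_basic_blocks=[7])
# A goto through a variable may resolve to several labels, like a switch.
computed = GotoStub(goto_label_names=["case_a", "case_b"], goto_basic_blocks=[3, 5])
print(is_computed_goto(direct), is_computed_goto(computed))
```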
- class eptalights_code.models.sophia_ir.function.SophiaIRSwitchModel(*, step_index: int | None = None, lineno: int = -1, basicblock_index: int | None = None, low_level_steps: List[int] = [], variables_defined_here: List[str] = [], variables_used_here: List[str] = [], ssa_variables_defined_here: List[str] = [], ssa_variables_used_here: List[str] = [], op: OpType = OpType.SWITCH, switch_index: TokenizedOperandModel, switch_cases: List[TokenizedOperandModel] | None = [], switch_label_names: List[str] | None = [], switch_basic_blocks: List[int] | None = [])
Represents a SWITCH operation in the SOPHIA IR model.
Attributes
- opOpType
The operation type, set to OpType.SWITCH.
- switch_indexTokenizedOperandModel
The operand representing the index or variable used in the switch condition.
- switch_casesOptional[List[TokenizedOperandModel]], optional
A list of operand models representing the case values in the switch statement. Defaults to an empty list.
- switch_label_namesOptional[List[str]], optional
A list of label names associated with each case. Defaults to an empty list.
- switch_basic_blocksOptional[List[int]], optional
A list of basic block indices corresponding to each case. Defaults to an empty list.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- property defined_tokenized_operands: List[TokenizedOperandModel]
Retrieve a list of tokenized operands defined at this switch statement.
Returns
- List[TokenizedOperandModel]
An empty list since a switch statement does not define new operands.
- update_operand_step_index() None
Update the step_index attribute for all operands in this switch statement.
- update_variables_defined_and_used_here() None
Update the lists of SSA and regular variables used in this switch statement.
- property used_tokenized_operands: List[TokenizedOperandModel]
Retrieve a list of tokenized operands used in this switch statement.
Returns
- List[TokenizedOperandModel]
A list of operands used in the switch index and case values, if they are variables.
- class eptalights_code.models.sophia_ir.function.ExprModel(*, expr_type: ExprType = ExprType.UNDEF, lhs: TokenizedOperandModel, rhs: TokenizedOperandModel | None = None)
Represents an expression in a program analysis context.
Attributes
- expr_typeExprType, optional
The type of the expression. Defaults to ExprType.UNDEF.
- lhsTokenizedOperandModel
The left-hand side operand of the expression.
- rhsTokenizedOperandModel, optional
The right-hand side operand of the expression, if applicable. Defaults to None for unary expressions.
Callsite
- class eptalights_code.models.sophia_ir.callsite.CallsiteModel(*, cid: str | None = None, step_index: int, fn_name: List[str], num_of_args: int = 0, variables_used_as_callsite_arg: List[str] = [], variables_defined_here: List[str] = [], ssa_variables_used_as_callsite_arg: List[str] = [], ssa_variables_defined_here: List[str] = [])
Represents a function call site.
Attributes
- cidOptional[str], optional
The unique identifier for the call site. Defaults to None.
- step_indexint
The index of the step in which this call occurs.
- fn_nameList[str]
The name of the function being called, stored as a list of strings.
- num_of_argsint, optional
The number of arguments passed to the function call. Defaults to 0.
- variables_used_as_callsite_argList[str], optional
A list of variable names that are used as arguments at the call site. Defaults to an empty list.
- variables_defined_hereList[str], optional
A list of variable names that are defined at the call site. Defaults to an empty list.
- ssa_variables_used_as_callsite_argList[str], optional
A list of SSA (Static Single Assignment) variables used as arguments. Defaults to an empty list.
- ssa_variables_defined_hereList[str], optional
A list of SSA variables defined at the call site. Defaults to an empty list.
- class eptalights_code.models.sophia_ir.callsite.CallsiteManagerModel(*, step_callsites: Dict[int, str] = {}, unique_callsites: Dict[str, List[str]] = {}, callsites: Dict[str, CallsiteModel] = {})
Manages function call sites.
Attributes
- step_callsitesdict[int, str], optional
A mapping from step indices to call sites, each identified by a string. Defaults to an empty dictionary.
- unique_callsitesdict[str, List[str]], optional
A mapping from function names to lists of their SSA function names. Defaults to an empty dictionary.
- callsitesdict[str, CallsiteModel], optional
A mapping from SSA variable names to their corresponding CallsiteModel instances. Defaults to an empty dictionary.
- all(name: str | None = None) List[CallsiteModel]
Retrieve all call sites or those matching a specific function name.
Parameters
- namestr, optional
The function name to filter by. If None, all call sites are returned.
Returns
- list[CallsiteModel]
A list of matching CallsiteModel instances.
- at_step(step_index: int) CallsiteModel | None
Retrieve a call site at a given step index.
Parameters
- step_indexint
The index of the step.
Returns
- CallsiteModel or None
The CallsiteModel instance if found, otherwise None.
- by_ssa_name(ssa_name: str) CallsiteModel | None
Retrieve a call site model by its SSA function name.
Parameters
- ssa_namestr
The SSA variable name of the call site.
Returns
- CallsiteModel or None
The corresponding CallsiteModel instance if found, otherwise None.
- property names: List[str]
Return a list of unique function names found in call sites.
Returns
- list[str]
A list of function names.
- search(name: str | None = None) List[CallsiteModel]
Search for call sites whose function names contain a given substring.
Parameters
- namestr, optional
The substring to search for in function names.
Returns
- list[CallsiteModel]
A list of matching CallsiteModel instances.
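How the three mappings of CallsiteManagerModel could cooperate in lookups is sketched below with plain dictionaries. The data, the SSA-style keys such as malloc_1, and the helper functions are assumptions made for illustration; the real methods return CallsiteModel instances and may be implemented differently.

```python
from typing import Dict, List, Optional

# Hypothetical data shaped like the three CallsiteManagerModel mappings.
# The SSA-style keys ("malloc_1") are an assumed naming scheme.
step_callsites: Dict[int, str] = {4: "malloc_1", 9: "free_1", 12: "malloc_2"}
unique_callsites: Dict[str, List[str]] = {
    "malloc": ["malloc_1", "malloc_2"],
    "free": ["free_1"],
}
callsites: Dict[str, dict] = {
    "malloc_1": {"step_index": 4, "num_of_args": 1},
    "free_1": {"step_index": 9, "num_of_args": 1},
    "malloc_2": {"step_index": 12, "num_of_args": 1},
}


def at_step(step_index: int) -> Optional[dict]:
    """Look up the call site recorded at a step, or None."""
    key = step_callsites.get(step_index)
    return callsites.get(key) if key is not None else None


def all_callsites(name: Optional[str] = None) -> List[dict]:
    """Return every call site, or only those of one function name."""
    if name is None:
        return list(callsites.values())
    return [callsites[k] for k in unique_callsites.get(name, [])]


def search(fragment: str) -> List[dict]:
    """Return call sites whose function name contains the fragment."""
    hits = []
    for fn_name, keys in unique_callsites.items():
        if fragment in fn_name:
            hits.extend(callsites[k] for k in keys)
    return hits


print(at_step(9), len(all_callsites("malloc")), len(search("ee")))
```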
CFG
- class eptalights_code.models.sophia_ir.cfg.ControlFlowGraphModel(*, basicblock_exit_nodes: List[int] = [], basicblock_steps: Dict[int, List[int]] = {}, basicblock_edges: Dict[int, List[int]] = {})
Represents a control flow graph (CFG) model.
Attributes
- basicblock_exit_nodesList[int], optional
A list of basic block exit nodes. Defaults to an empty list.
- basicblock_stepsDict[int, List[int]], optional
A mapping of basic block indices to their corresponding step indices. Defaults to an empty dictionary.
- basicblock_edgesDict[int, List[int]], optional
A mapping of basic block indices to their successor basic blocks. Defaults to an empty dictionary.
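The three CFG fields are enough to walk a function block by block. The sketch below uses made-up data in the documented shapes and assumes that block 0 is the entry block, which the reference does not state.

```python
from collections import deque
from typing import Dict, List

# Hypothetical CFG data shaped like the ControlFlowGraphModel fields.
basicblock_edges: Dict[int, List[int]] = {0: [1, 2], 1: [3], 2: [3], 3: []}
basicblock_steps: Dict[int, List[int]] = {0: [0, 1], 1: [2], 2: [3, 4], 3: [5]}
basicblock_exit_nodes: List[int] = [3]


def reachable_blocks(entry: int) -> List[int]:
    """Breadth-first walk over successor edges, starting at the entry block."""
    seen = {entry}
    order = []
    queue = deque([entry])
    while queue:
        block = queue.popleft()
        order.append(block)
        for succ in basicblock_edges.get(block, []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return order


blocks = reachable_blocks(0)  # assumes block 0 is the entry block
steps_in_order = [s for b in blocks for s in basicblock_steps.get(b, [])]
print(blocks, steps_in_order)
```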
Config
- class eptalights_code.models.sophia_ir.config.ConfigModel(*, project_id: str | None = None, extractor_output_path: str, code_type: str, storage_backend: str | None = 'sqlite3', local_database_path: str | None = None, output_decompiled_path: str | None = './__eptalights_decompiled_code/')
Represents the configuration settings for a code analysis process.
Attributes
- project_idstr, optional
The unique identifier for the project. Defaults to None.
- extractor_output_pathstr
The path to the extractor’s output directory.
- code_typestr
The type of code being processed (e.g., source, intermediate representation).
- storage_backendstr, optional
The storage backend used for extracted data. Defaults to 'sqlite3'; support for additional file-based databases is planned.
- local_database_pathstr, optional
The path to the database storing extracted information. Defaults to None.
- output_decompiled_pathstr, optional
The destination path for storing decompiled code. Defaults to “./__eptalights_decompiled_code/”.
Variable
- class eptalights_code.models.sophia_ir.variable.SSAVariableModel(*, ssa_name: str | None = None, ssa_version: int = 0, variable_name: str | None = None, variable_defined_at_steps: list[int] = [], variable_used_at_steps: list[int] = [], variable_used_in_callsites: list[str] = [], record_attributes_defined_at_steps: Dict[int, List[str]] = {}, record_attributes_used_at_steps: Dict[int, List[str]] = {}, used_inside_other_tokenized_operand_tokens_at_step: Dict[int, List[str]] | None = {}, tokenized_operands_defs_at_steps: Dict[int, List[TokenizedOperandModel]] = {}, tokenized_operands_uses_at_steps: Dict[int, List[TokenizedOperandModel]] = {})
Represents an SSA (Static Single Assignment) variable in program analysis.
Attributes
- ssa_namestr, optional
The SSA name of the variable, if available. Defaults to None.
- ssa_versionint, optional
The SSA version of the variable. Defaults to 0.
- variable_namestr, optional
The original variable name before SSA transformation. Defaults to None.
- variable_defined_at_stepslist of int, optional
A list of step indices where the variable is defined. Defaults to an empty list.
- variable_used_at_stepslist of int, optional
A list of step indices where the variable is used. Defaults to an empty list.
- variable_used_in_callsiteslist of str, optional
A list of function call sites where this variable is used. Defaults to an empty list.
- record_attributes_defined_at_stepsdict of {int: list of str}, optional
A mapping of step indices to lists of record attributes defined at each step. Defaults to an empty dictionary.
- record_attributes_used_at_stepsdict of {int: list of str}, optional
A mapping of step indices to lists of record attributes used at each step. Defaults to an empty dictionary.
- used_inside_other_tokenized_operand_tokens_at_stepdict of {int: list of str}, optional
A mapping of step indices to lists of tokenized operand tokens in which this variable is used. Defaults to an empty dictionary.
- tokenized_operands_defs_at_stepsdict of {int: list of TokenizedOperandModel}, optional
A mapping of step indices to lists of tokenized operand definitions associated with this variable. Defaults to an empty dictionary.
- tokenized_operands_uses_at_stepsdict of {int: list of TokenizedOperandModel}, optional
A mapping of step indices to lists of tokenized operand uses associated with this variable. Defaults to an empty dictionary.
- class eptalights_code.models.sophia_ir.variable.VariableModel(*, vid: str | None = None, name: str, vartype: VarType = VarType.UNDEF, unique_ssa_variables: Dict[str, SSAVariableModel] = {}, full_declaration: str | None = None, type_declaration: str | None = None, type_props: List[str] | None = [], tokenized_type_declaration: TokenizedOperandModel | None = None, additional_info: Dict[str, Any] = {}, phi_ssa_variables: Dict[str, List[str]] = {})
Represents a variable in program analysis, including SSA details.
Attributes
- vidstr, optional
A unique identifier for the variable in the format filename:function_name:variable_name. Defaults to None.
- namestr
The name of the variable.
- vartypeVarType, optional
The type of the variable, defaulting to VarType.UNDEF.
- unique_ssa_variablesdict[str, SSAVariableModel], optional
A mapping of SSA variable names to their corresponding SSAVariableModel instances. Defaults to an empty dictionary.
- full_declarationstr, optional
The full declaration of the variable, if available.
- type_declarationstr, optional
The type declaration of the variable, if available.
- type_propslist[str], optional
A list of additional type properties. Defaults to an empty list.
- tokenized_type_declarationTokenizedOperandModel, optional
A tokenized representation of the variable’s type declaration.
- additional_infodict[str, Any], optional
Additional metadata about the variable.
- phi_ssa_variablesdict[str, list[str]], optional
A mapping of SSA variable names to lists of SSA versions used in phi functions. Defaults to an empty dictionary.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- property defined_at_steps: List[int]
Get the unique execution steps where the variable is defined.
Returns
- list[int]
A list of unique step indices where this variable is defined.
- get_ssa_var(ssa_name: str) SSAVariableModel | None
Retrieve a specific SSA version of the variable.
Parameters
- ssa_namestr
The name of the SSA variable to retrieve.
Returns
- SSAVariableModel or None
The SSAVariableModel instance corresponding to ssa_name, or None if not found.
- property ssa_variables: List[SSAVariableModel]
Get all SSA versions of the variable.
Returns
- list[SSAVariableModel]
A list of SSAVariableModel instances representing SSA versions of this variable.
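The defined_at_steps property is described as returning unique step indices, which suggests merging the per-SSA-version step lists. The sketch below shows one way that merge could look. The data and the SSA names are hypothetical, and returning the result sorted is a choice made here, not something the reference specifies.

```python
from typing import Dict, List

# Hypothetical data: SSA name -> steps where that SSA version is defined,
# mirroring variable_defined_at_steps on each SSAVariableModel.
ssa_defined_at: Dict[str, List[int]] = {
    "i_1": [2],
    "i_2": [7],
    "i_3": [7, 11],
}


def defined_at_steps(per_ssa: Dict[str, List[int]]) -> List[int]:
    """Collect the unique step indices across all SSA versions."""
    unique = set()
    for steps in per_ssa.values():
        unique.update(steps)
    # Sorting is an assumption; the reference only promises uniqueness.
    return sorted(unique)


result = defined_at_steps(ssa_defined_at)
print(result)
```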
- class eptalights_code.models.sophia_ir.variable.VariableManagerModel(*, function_args: List[str] = [], local_variables: List[str] = [], tmp_variables: List[str] = [], return_variables: List[str] = [], variables: Dict[str, VariableModel] = {})
Manages variables within a function’s scope, tracking their usage and definitions.
Attributes
- function_argsList[str], optional
A list of function argument names. Defaults to an empty list.
- local_variablesList[str], optional
A list of local variable names. Defaults to an empty list.
- tmp_variablesList[str], optional
A list of temporary variable names. Defaults to an empty list.
- return_variablesList[str], optional
A list of variables representing function return values. Defaults to an empty list.
- variablesDict[str, VariableModel], optional
A dictionary mapping variable names to their corresponding VariableModel instances. Defaults to an empty dictionary.
- all() List[VariableModel]
Retrieve all stored variables.
Returns
- List[VariableModel]
A list of all VariableModel instances stored in variables.
- defined_at_step(step_index: int) List[VariableModel]
Retrieve variables that are defined at a given step.
Parameters
- step_indexint
The step index to check for variable definitions.
Returns
- List[VariableModel]
A list of VariableModel instances that were defined at the given step.
- get(name: str) VariableModel
Retrieve a variable by its name.
Parameters
- namestr
The name of the variable to retrieve.
Returns
- VariableModel or None
The corresponding VariableModel instance if found, otherwise None.
- property names: List[str]
Get the list of all variable names.
Returns
- List[str]
A list of all variable names stored in variables.
- search(name: str | None = None) List[VariableModel]
Search for variables whose names contain a given substring.
Parameters
- namestr, optional
The substring to search for in variable names.
Returns
- list[VariableModel]
A list of matching VariableModel instances.
- property ssa_names: List[str]
Get the list of all unique SSA (Static Single Assignment) variable names.
Returns
- List[str]
A list of unique SSA variable names across all stored variables.
- used_at_step(step_index: int) List[VariableModel]
Retrieve variables that are used at a given step.
Parameters
- step_indexint
The step index to check for variable usage.
Returns
- List[VariableModel]
A list of VariableModel instances that were used at the given step.
- used_or_defined_at_step(step_index: int) List[VariableModel]
Retrieve variables that are either used or defined at a given step.
Parameters
- step_indexint
The step index to check for variable usage or definition.
Returns
- List[VariableModel]
A list of VariableModel instances used or defined at the given step.
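The step-based queries of VariableManagerModel can be pictured as filters over per-variable step sets. In this sketch the data is invented and the helpers return variable names for brevity, whereas the documented methods return VariableModel instances; the internal layout is an assumption.

```python
from typing import Dict, List, Set

# Hypothetical per-variable step data: name -> defined steps and used steps.
variables: Dict[str, Dict[str, Set[int]]] = {
    "buf": {"defined": {1}, "used": {3, 5}},
    "len": {"defined": {2}, "used": {3}},
    "buf_len": {"defined": {3}, "used": {5}},
}


def defined_at_step(step_index: int) -> List[str]:
    """Names of variables defined at the given step."""
    return sorted(n for n, v in variables.items() if step_index in v["defined"])


def used_at_step(step_index: int) -> List[str]:
    """Names of variables used at the given step."""
    return sorted(n for n, v in variables.items() if step_index in v["used"])


def used_or_defined_at_step(step_index: int) -> List[str]:
    """Union of the two queries above, without duplicates."""
    return sorted(set(defined_at_step(step_index)) | set(used_at_step(step_index)))


def search(fragment: str) -> List[str]:
    """Names of variables containing the given substring."""
    return sorted(n for n in variables if fragment in n)


print(defined_at_step(3), used_at_step(3), used_or_defined_at_step(3), search("len"))
```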
Tokenized Operands
- class eptalights_code.models.sophia_ir.tokenized_operand.TokenizedOperandModel(*, operand_type: TokenType | None = TokenType.IS_UNDEF, ssa_name: str | None = None, ssa_version: Annotated[int, Strict(strict=True)] = 0, variable_name: str | None = None, step_index: Annotated[int, Strict(strict=True)] | None = None, position: Annotated[int, Strict(strict=True)] = 0, used_inside_other_tokenized_operand_tokens_at_step: Dict[Annotated[int, Strict(strict=True)], List[str]] = {}, current_depth_position: Annotated[int, Strict(strict=True)] = 0, tokens: List[TokenModel] = [])
Represents a tokenized operand used within a specific step of program analysis.
Attributes
- operand_typeTokenType, optional
The type of operand. Defaults to TokenType.IS_UNDEF.
- ssa_namestr, optional
The SSA name for the operand.
- ssa_versionint
The version of the SSA name. Defaults to 0.
- variable_namestr, optional
The variable name for the operand.
- step_indexint, optional
The index of the step where the operand is used.
- positionint
The position of the operand within a specific context. Defaults to 0.
- used_inside_other_tokenized_operand_tokens_at_stepdict[int, list[str]], optional
A dictionary where keys represent step indices and values are lists of tokens used in other tokenized operands at those steps. Defaults to an empty dictionary.
- current_depth_positionint
The current depth position of the operand. Defaults to 0.
- tokenslist[TokenModel]
A list of tokens associated with the operand.
- _debug_visited_nodeslist[str], optional
For debugging purposes, tracks visited nodes during analysis. Defaults to an empty list.
- array_index_token_at_index(idx: int) TokenModel
Retrieves the array index token at a specified index.
Parameters
- idxint
The index of the array token to retrieve.
Returns
- TokenModel
The token at the specified index, or an empty TokenModel if not found.
- array_index_token_values_iter() List[TokenModel]
Yields the values of tokens representing array indices.
Yields
- str
The value of the token representing an array index.
- array_index_tokens_iter() List[TokenModel]
Yields the tokens representing array indices.
Yields
- TokenModel
The token representing an array index.
- constant_index_tokens_iter() List[TokenModel]
Yields tokens that represent constants.
Yields
- TokenModel
The token representing a constant.
- constant_token_at_index(idx: int) TokenModel
Retrieves the constant token at a specified index.
Parameters
- idxint
The index of the constant token to retrieve.
Returns
- TokenModel
The token at the specified index, or an empty TokenModel if not found.
- decompile()
Generate a human-readable or high-level representation.
Returns
- str
A string representation of the expression.
- get_field_attributes_used_in_tokens() List[str]
Extracts and returns the attribute values used in the tokens.
Returns
- list[str]
A list of attribute values from tokens with type TokenType.IS_ATTRIBUTE.
- get_total_array_index_tokens() int
Returns the total number of array index tokens.
Returns
- int
The total number of array index tokens.
- has_constant_in_tokens() bool
Checks if any token in the operand is a constant.
Returns
- bool
True if a constant token is found, otherwise False.
- has_ssa_variable_extracted() bool
Checks whether the operand has an SSA variable extracted.
Returns
- bool
True if both variable_name and ssa_name are not None, otherwise False.
- model_post_init(context: Any, /) None
This function is meant to behave like a BaseModel method to initialise private attributes.
It takes context as an argument since that’s what pydantic-core passes when calling it.
Parameters
- self
The BaseModel instance.
- context
The context.
- pretty_print_tokens()
Prints a pretty representation of the tokens.
This function converts the tokens to a dictionary format and prints them for debugging purposes.
- symbol_index_tokens_iter() List[TokenModel]
Yields tokens that represent symbols.
Yields
- TokenModel
The token representing a symbol.
- symbol_token_at_index(idx: int) TokenModel
Retrieves the symbol token at a specified index.
Parameters
- idxint
The index of the symbol token to retrieve.
Returns
- TokenModel
The token at the specified index, or an empty TokenModel if not found.
- class eptalights_code.models.sophia_ir.tokenized_operand.TokenModel(*, token_type: TokenType = TokenType.IS_UNDEF, is_base_variable: bool = False, code_name: str | None = None, value: str | None = None, value_extended: str | None = None, discovery_depth: int = 0)
Represents a token model with attributes holding metadata.
Attributes
- model_configdict
Configuration dictionary for model validation settings, such as enabling assignment validation.
- token_typeTokenType
The type of the token, defaults to TokenType.IS_UNDEF.
- is_base_variablebool
Flag indicating whether the token is a base variable, defaults to False.
- code_namestr, optional
An optional string representing the code name of the token.
- valuestr, optional
An optional string representing the token’s value.
- value_extendedstr, optional
An optional extended value of the token.
- discovery_depthint
The depth of discovery for the token, defaults to 0.
Enum Types
- class eptalights_code.models.sophia_ir.enum_types.VarType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Represents the type of a variable in program analysis.
Attributes
- FUNCTION_ARGUMENT
Represents a function argument.
- LOCAL_VARIABLE
Represents a local variable.
- TMP_VARIABLE
Represents a temporary variable.
- GLOBAL_VARIABLE
Represents a global variable.
- UNDEF
Represents an undefined variable type.
- class eptalights_code.models.sophia_ir.enum_types.TokenType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
An enumeration representing the token type of an element.
Attributes
- IS_UNDEF
Represents an undefined token type.
- IS_VARIABLE
Represents a variable token type. Example: Array access used with a variable index, e.g., a[n] where a and n are variables.
- IS_CONSTANT
Represents a constant token type. Example: Array access used with a constant index, e.g., a[10].
- IS_SYMBOL
Represents a symbol token type. Example: Symbols like [, ], *, ., ->, &, etc.
- IS_ATTRIBUTE
Represents an attribute token type. Example: Struct field access, e.g., st.data, st->data.
- IS_TYPE
Represents a type token type. Example: Type declaration, e.g., struct FILE fp.
- IS_FUNCTION
Represents a function token type. Example: Functions passed as arguments to calls, e.g., select_files and alphasort.
Example Usage:
    int select_files(const struct dirent *dirbuf)
    {
        if (dirbuf->d_name[0] == '.')
            return 0;
        else
            return 1;
    }

    int alphasort(const struct dirent **a, const struct dirent **b)
    {
        return (strcmp((*a)->d_name, (*b)->d_name));
    }

    int scandir(
        const char *dir,
        struct dirent ***namelist,
        int (*select) (const struct dirent *),
        int (*compar) (const struct dirent **, const struct dirent **)
    );
- IS_VARIABLE_AND_IS_FUNCTION
Represents both a variable and a function token type.
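To make the distinction between IS_VARIABLE and IS_CONSTANT concrete, the sketch below checks what kind of token sits inside an array index, using the `a[n]` and `a[10]` examples from the attribute list. `TokenTypeSketch` is a hypothetical stand-in enum with arbitrary values, and the token sequences are assumed; the tokens the library actually emits may differ.

```python
from enum import Enum, auto
from typing import List, Tuple


class TokenTypeSketch(Enum):
    """Hypothetical stand-in for TokenType (values are arbitrary)."""
    IS_UNDEF = auto()
    IS_VARIABLE = auto()
    IS_CONSTANT = auto()
    IS_SYMBOL = auto()


# A token is modelled here as (token type, source text).
Token = Tuple[TokenTypeSketch, str]


def index_token_types(tokens: List[Token]) -> List[TokenTypeSketch]:
    """Collect the types of the tokens that appear between '[' and ']'."""
    inside = False
    found: List[TokenTypeSketch] = []
    for token_type, text in tokens:
        if token_type is TokenTypeSketch.IS_SYMBOL and text == "[":
            inside = True
        elif token_type is TokenTypeSketch.IS_SYMBOL and text == "]":
            inside = False
        elif inside:
            found.append(token_type)
    return found


# Assumed token sequences for `a[n]` and `a[10]`.
variable_index = [
    (TokenTypeSketch.IS_VARIABLE, "a"),
    (TokenTypeSketch.IS_SYMBOL, "["),
    (TokenTypeSketch.IS_VARIABLE, "n"),
    (TokenTypeSketch.IS_SYMBOL, "]"),
]
constant_index = [
    (TokenTypeSketch.IS_VARIABLE, "a"),
    (TokenTypeSketch.IS_SYMBOL, "["),
    (TokenTypeSketch.IS_CONSTANT, "10"),
    (TokenTypeSketch.IS_SYMBOL, "]"),
]
```

A check like this is one way an analysis could separate fixed-offset accesses from accesses whose index depends on a variable.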
- class eptalights_code.models.sophia_ir.enum_types.OpType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Enumeration of operation types.
Attributes
- NOP
No operation.
- ASSIGN
Assignment operation.
- CALL
Function or method call.
- RETURN
Return from a function or method.
- COND
Conditional operation (e.g., if-else).
- GOTO
Unconditional jump to another location.
- SWITCH
Switch-case operation.
- LABEL
Label for jump or branch operations.
- class eptalights_code.models.sophia_ir.enum_types.ExprType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Enumeration of expression types.
Attributes
- NO_EXPR
No expression.
- MULT_EXPR
Multiplication expression (lhs * rhs).
- PLUS_EXPR
Addition expression (lhs + rhs).
- MINUS_EXPR
Subtraction expression (lhs - rhs).
- RDIV_EXPR
Real (floating-point) division expression (lhs / rhs).
- DIV_EXPR
Division expression (lhs / rhs).
- MOD_EXPR
Modulo expression (lhs % rhs).
- GREATER_THAN_OR_EQUAL_EXPR
Greater than or equal to expression (lhs >= rhs).
- GREATER_THAN_EXPR
Greater than expression (lhs > rhs).
- LESS_THAN_EXPR
Less than expression (lhs < rhs).
- LESS_THAN_OR_EQUAL_EXPR
Less than or equal to expression (lhs <= rhs).
- EQUAL_EXPR
Equal to expression (lhs == rhs).
- NOT_EQUAL_EXPR
Not equal to expression (lhs != rhs).
- BITWISE_AND_EXPR
Bitwise AND expression (lhs & rhs).
- BITWISE_EXCLUSIVE_OR_EXPR
Bitwise exclusive OR expression (lhs ^ rhs).
- BITWISE_INCLUSIVE_OR_EXPR
Bitwise inclusive OR expression (lhs | rhs).
- BITWISE_NOT_EXPR
Bitwise NOT expression (~lhs).
- TRUNC_DIV_EXPR
Truncated division expression (lhs / rhs).
- TRUNC_MOD_EXPR
Truncated modulo expression (lhs % rhs).
- LSHIFT_EXPR
Left shift expression (lhs << rhs).
- RSHIFT_EXPR
Right shift expression (lhs >> rhs).
- RROTATE_EXPR
Right rotate expression ((((x) >> (b)) | ((x) << (32 - (b))))).
- NEGATE_EXPR
Negate expression (-lhs).
- MIN_EXPR
Minimum expression (lhs < rhs ? lhs : rhs).
- MAX_EXPR
Maximum expression (lhs > rhs ? lhs : rhs).
- POINTER_PLUS_EXPR
Pointer plus expression (lhs + rhs). This node represents pointer arithmetic. The first operand is always a pointer/reference type. The second operand is always an unsigned integer type compatible with sizetype. This is the only binary arithmetic operand that can operate on pointer types.
- FIX_TRUNC_EXPR
Fix trunc expression (conversion of floating-point value to an integer). These nodes represent conversion of a floating-point value to an integer. The single operand will have a floating-point type, while the complete expression will have an integral (or boolean) type. The operand is rounded towards zero.
dst[bool, int] = (floating-point type)rhs
- REALPART_EXPR
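Real part of a complex number (the single operand).
- IMAGPART_EXPR
Imaginary part of a complex number (the single operand).
- ABS_EXPR
Absolute value of the single operand.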
- ABSU_EXPR
These nodes represent the absolute value of the single operand in equivalent unsigned type such that ABSU_EXPR of TYPE_MIN is well defined.
- SPACESHIP_EXPR
Three-way comparison expression (lhs <=> rhs).
- UNDEF
Undefined expression.
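Several of these expression types differ from Python's built-in operators, which matters when re-evaluating IR expressions in Python. The sketch below shows the semantics that the descriptions suggest for a few of them. The function names are hypothetical helpers, not part of the library; the 32-bit width for rotation follows the formula given for RROTATE_EXPR, and the -1/0/1 result for the three-way comparison is an assumption.

```python
import math


def trunc_div(lhs: int, rhs: int) -> int:
    """TRUNC_DIV_EXPR: integer division rounding toward zero (C semantics).

    Python's // rounds toward negative infinity, so it cannot be used directly.
    """
    quotient = abs(lhs) // abs(rhs)
    return quotient if (lhs < 0) == (rhs < 0) else -quotient


def trunc_mod(lhs: int, rhs: int) -> int:
    """TRUNC_MOD_EXPR: remainder consistent with trunc_div (sign follows lhs)."""
    return lhs - rhs * trunc_div(lhs, rhs)


def rrotate32(x: int, b: int) -> int:
    """RROTATE_EXPR on a 32-bit value: rotate x right by b bits."""
    x &= 0xFFFFFFFF
    b %= 32
    return ((x >> b) | (x << (32 - b))) & 0xFFFFFFFF


def fix_trunc(value: float) -> int:
    """FIX_TRUNC_EXPR: convert a float to an integer, rounding toward zero."""
    return math.trunc(value)


def spaceship(lhs, rhs) -> int:
    """SPACESHIP_EXPR: -1, 0 or 1 for less, equal, greater (assumed encoding)."""
    return (lhs > rhs) - (lhs < rhs)
```

For example, `trunc_div(-7, 2)` gives -3 where Python's `-7 // 2` gives -4, which is exactly the kind of mismatch these helpers are meant to avoid.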
File Metadata
- class eptalights_code.models.sophia_ir.file_metadata.ClassMetadataModel(*, class_props: Dict[str, TokenizedOperandModel] = {}, class_methods: Dict[str, str] = {})
Represents metadata for a class within a program analysis context.
Attributes
- class_propsDict[str, TokenizedOperandModel]
A dictionary mapping property names to their tokenized representations. Represents the attributes (fields) of the class.
- class_methodsDict[str, str]
A dictionary mapping method names to their unique identifiers.
- class eptalights_code.models.sophia_ir.file_metadata.FileMetadataModel(*, filepath: str, classes: Dict[str, ClassMetadataModel] = {}, functions: Dict[str, str] = {})
Represents metadata for a file within a program analysis context.
Attributes
- filepathstr
The path to the file being analyzed.
- classesDict[str, ClassMetadataModel]
A dictionary mapping class names to their metadata models.
- functionsDict[str, str]
A dictionary mapping function names to their unique identifiers.
File Data
- class eptalights_code.models.sophia_ir.file_data.ClassDataModel(*, class_props: Dict[str, TokenizedOperandModel] = {}, class_methods: Dict[str, FunctionModel] = {})
Represents a class within a program analysis context.
Attributes
- class_propsDict[str, TokenizedOperandModel]
A dictionary mapping property names to their tokenized representations. Represents the attributes (fields) of the class.
- class_methodsDict[str, FunctionModel]
A dictionary mapping method names to their corresponding function models. Represents the functions (methods) of the class.
- class eptalights_code.models.sophia_ir.file_data.FileDataModel(*, filepath: str, classes: Dict[str, ClassDataModel] = {}, functions: Dict[str, FunctionModel] = {})
Represents a file within a program analysis context.
Attributes
- filepathstr
The path to the file being analyzed.
- classesDict[str, ClassDataModel]
A dictionary mapping class names to their corresponding class models.
- functionsDict[str, FunctionModel]
A dictionary mapping function names to their corresponding function models.
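Because a file holds both free functions and class methods, a common task is to walk all of them in one pass. The sketch below shows one way to do that. The `*Sketch` dataclasses are hypothetical stand-ins that mirror only the documented `classes`, `class_methods` and `functions` fields, and the sample names and identifiers are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Tuple


@dataclass
class FunctionSketch:
    """Hypothetical stand-in for FunctionModel (only fid and name)."""
    fid: str
    name: str


@dataclass
class ClassDataSketch:
    """Hypothetical stand-in for ClassDataModel."""
    class_methods: Dict[str, FunctionSketch] = field(default_factory=dict)


@dataclass
class FileDataSketch:
    """Hypothetical stand-in for FileDataModel."""
    filepath: str
    classes: Dict[str, ClassDataSketch] = field(default_factory=dict)
    functions: Dict[str, FunctionSketch] = field(default_factory=dict)


def iter_all_functions(
    file_data: FileDataSketch,
) -> Iterator[Tuple[str, FunctionSketch]]:
    """Yield (qualified name, function) for free functions, then class methods."""
    for name, function in file_data.functions.items():
        yield name, function
    for class_name, class_data in file_data.classes.items():
        for method_name, function in class_data.class_methods.items():
            yield f"{class_name}.{method_name}", function


# Invented sample data; the fid values are placeholders.
sample = FileDataSketch(
    filepath="src/example.c",
    functions={"main": FunctionSketch("f1", "main")},
    classes={"Parser": ClassDataSketch({"run": FunctionSketch("f2", "run")})},
)
```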
Dataflow
- class eptalights_code.models.sophia_ir.dataflow.SinkResultType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Enumeration representing possible sink result types.
Attributes
- CONTINUESinkResultType
Indicates that the process should continue.
- OKSinkResultType
Indicates a successful result.
- STOPSinkResultType
Indicates that the process should stop.
- class eptalights_code.models.sophia_ir.dataflow.DataflowEventModel(*, op: OpType, lineno: int, variable_name: str, ssa_variable_name: str, ssa_version: int, data_direction: str | None = None, var_depth_pos: int, step_index: int, record_attributes_defined_here: List[str] | None = [], record_attributes_used_here: List[str] | None = [], used_inside_other_tokenized_operand_tokens_here: bool = False, current_record_attibute_tracked: str | None = None)
Represents a dataflow event in the dataflow analysis.
Attributes
- opOpType
The operation type associated with this dataflow event.
- linenoint
The line number where this dataflow event occurs.
- variable_namestr
The name of the variable involved in this event.
- ssa_variable_namestr
The SSA (Static Single Assignment) name of the variable.
- ssa_versionint
The SSA version number of the variable.
- data_directionstr, optional
The direction of data flow (e.g., “read” or “write”). Defaults to None.
- var_depth_posint
Internal property for debugging variable position.
- step_indexint
The step index of the program execution where this event occurs.
- record_attributes_defined_hereList[str], optional
A list of record attributes that are defined at this event. Defaults to an empty list.
- record_attributes_used_hereList[str], optional
A list of record attributes that are used at this event. Defaults to an empty list.
- used_inside_other_tokenized_operand_tokens_herebool, optional
Indicates whether this variable is used inside other tokenized operand tokens. Defaults to False.
- current_record_attibute_trackedstr, optional
Current record attribute being tracked. Defaults to None.
- class eptalights_code.models.sophia_ir.dataflow.DataflowStateModel(*, current_event: DataflowEventModel, current_function: FunctionModel, current_step: SophiaIRNopModel | SophiaIRAssignModel | SophiaIRCallModel | SophiaIRCondModel | SophiaIRReturnModel | SophiaIRGotoModel | SophiaIRSwitchModel, previous_events: List[DataflowEventModel] = [])
Represents the state of data flow analysis at a specific point in execution.
Attributes
- current_eventDataflowEventModel
The current data flow event being processed.
- current_functionFunctionModel
The function in which the current event is occurring.
- current_stepUnion[function_model.SophiaIRNopModel, function_model.SophiaIRAssignModel, function_model.SophiaIRCallModel, function_model.SophiaIRCondModel, function_model.SophiaIRReturnModel, function_model.SophiaIRGotoModel, function_model.SophiaIRSwitchModel]
The current step in the execution, represented by one of the SOPHIA IR models.
- previous_eventsList[DataflowEventModel], optional
A list of previous data flow events leading up to the current state. Defaults to an empty list.
- class eptalights_code.models.sophia_ir.dataflow.DataflowPathModel(*, events: List[DataflowEventModel] = [], passthru_callsites: List[str] = [], data_mutation_count: int = 0)
Represents a sequence of data flow events in a program analysis context.
Attributes
- eventsList[DataflowEventModel], optional
A list of data flow events that constitute this path. Defaults to an empty list.
- passthru_callsitesList[str], optional
A list of function call sites that the data flow passes through. Defaults to an empty list.
- data_mutation_countint
The number of times data was mutated. Defaults to 0.
- class eptalights_code.models.sophia_ir.dataflow.DataflowRequestModel(*, function: FunctionModel, source_variable_name: str, sink_callback_fn: Callable[[DataflowStateModel], SinkResultType], start_from_step_index: int | None = None, timeout_secs: int = 180, strict_record_attributes_tracking: bool = True)
Represents a request for data flow analysis.
Attributes
- source_variable_namestr
The name of the variable that serves as a data flow source.
- start_from_step_indexint, optional
The index of the step from which the data flow source starts. If not provided, it defaults to None.
- sink_callback_fnCallable[[DataflowStateModel], SinkResultType]
A function that processes a DataflowStateModel and returns a SinkResultType. This function represents the sink in the data flow analysis.
Example:
    def reachability_to_malloc_sink(
        state: DataflowStateModel
    ) -> SinkResultType:
        if (
            state.current_event.op == models.OpType.CALL
            and state.current_step.fname == "malloc"
        ):
            return models.SinkResultType.OK
        return models.SinkResultType.CONTINUE
- functionFunctionModel
The function model representing the target function for data flow analysis.
- timeout_secsint, optional
The maximum time (in seconds) allowed for the analysis. Defaults to 180 seconds (3 minutes). Maximum is 600 seconds (10 minutes).
- strict_record_attributes_trackingbool
If True, strictly tracks data within complex records or variables, following only specific attributes rather than the base variable.
Example:
    # Given the following code:
    a.attr = 10
    x = a
    print(a.something_else)

    # If `strict_record_attributes_tracking` is False, the data flow path will
    # include `print(a.something_else)`, since it tracks `a` as a whole.
    # However, if `strict_record_attributes_tracking` is True, the data flow
    # path will **not** include `print(a.something_else)`, as it focuses only
    # on `a.attr`.

    # Another case:
    a.attr = 10
    x = a
    print(a.attr)

    # Here, the `print(a.attr)` statement **will** be included in the data
    # flow path, as it matches the tracked attribute.

    # Using strict attribute tracking is encouraged for accuracy, unless
    # attributes cause side effects or a different behavior is required.
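A sink callback can use all three SinkResultType values. The sketch below is a self-contained illustration with stand-in types so that it runs without the library: every `*Sketch` class is hypothetical and mirrors only the fields the callback reads (`current_event.op` and `current_step.fname`, as used in the malloc example). Choosing memcpy as the sink and free as the point to give up is an invented policy, and the exact effect of STOP on the analysis is not specified in this reference.

```python
from dataclasses import dataclass
from enum import Enum, auto


class OpTypeSketch(Enum):
    """Hypothetical stand-in for OpType (subset, arbitrary values)."""
    ASSIGN = auto()
    CALL = auto()


class SinkResultSketch(Enum):
    """Hypothetical stand-in for SinkResultType."""
    CONTINUE = auto()
    OK = auto()
    STOP = auto()


@dataclass
class EventSketch:
    op: OpTypeSketch


@dataclass
class StepSketch:
    fname: str = ""


@dataclass
class StateSketch:
    """Hypothetical stand-in for DataflowStateModel."""
    current_event: EventSketch
    current_step: StepSketch


def memcpy_sink(state: StateSketch) -> SinkResultSketch:
    """Report OK at memcpy, STOP at free, and CONTINUE everywhere else."""
    if state.current_event.op is OpTypeSketch.CALL:
        if state.current_step.fname == "memcpy":
            return SinkResultSketch.OK
        if state.current_step.fname == "free":
            return SinkResultSketch.STOP
    return SinkResultSketch.CONTINUE
```

With the real models, a callback of this shape would be passed as `sink_callback_fn` when constructing a DataflowRequestModel.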
- class eptalights_code.models.sophia_ir.dataflow.DataflowResponseModel(*, status: bool, paths: List[DataflowPathModel] = [], error_message: str | None = None)
Represents the response model for a dataflow analysis request.
Attributes
- statusbool
Indicates whether the dataflow analysis was successful.
- pathsList[DataflowPathModel], optional
A list of dataflow paths resulting from the analysis. Defaults to an empty list.
- error_messageOptional[str], optional
An error message if the analysis failed. Defaults to None.
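A caller would typically check `status` before reading `paths`. The sketch below shows that pattern with stand-in dataclasses: the `*Sketch` classes are hypothetical and mirror only the documented fields they use, and the output format of `summarize` is an arbitrary choice for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EventSketch:
    """Hypothetical stand-in for DataflowEventModel (two fields only)."""
    lineno: int
    variable_name: str


@dataclass
class PathSketch:
    """Hypothetical stand-in for DataflowPathModel."""
    events: List[EventSketch] = field(default_factory=list)
    data_mutation_count: int = 0


@dataclass
class ResponseSketch:
    """Hypothetical stand-in for DataflowResponseModel."""
    status: bool
    paths: List[PathSketch] = field(default_factory=list)
    error_message: Optional[str] = None


def summarize(response: ResponseSketch) -> List[str]:
    """Return one line per path, or a single failure line if status is False."""
    if not response.status:
        return [f"analysis failed: {response.error_message}"]
    return [
        " -> ".join(f"{event.variable_name}@{event.lineno}" for event in path.events)
        for path in response.paths
    ]


# Invented sample responses.
ok_response = ResponseSketch(
    status=True,
    paths=[PathSketch([EventSketch(3, "buf"), EventSketch(9, "tmp")])],
)
failed_response = ResponseSketch(status=False, error_message="timeout")
```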
- class eptalights_code.models.sophia_ir.dataflow.DataflowActionModel(*, action_id: Annotated[UUID, UuidVersion(uuid_version=4)], request_hash: str, status: DataflowActionStatusType, request: DataflowRequestModel, response: DataflowResponseModel | None = None, data_created: str)
Represents the model for a dataflow action request and its response.
Attributes
- action_idUUID4
Unique identifier for the dataflow action.
- request_hashstr
Hash value of the request payload, used for deduplication or caching.
- statusDataflowActionStatusType
The current status of the dataflow action.
- requestDataflowRequestModel
The original request model containing parameters for the dataflow action.
- responseOptional[DataflowResponseModel], optional
The response model containing the results of the dataflow action. Defaults to None if no response is available yet.
- data_createdstr
Timestamp indicating when the dataflow action was created.