DARPA’s Guaranteed Architecture for Physical Systems (GAPS) is a research program that addresses software and hardware for compartmentalized applications where multiple parties with strong physical isolation of their computational environment, have specific constraints on data sharing (possibly with redaction requirements) with other parties, and any data exchange between the parties is mediated through a guard that enforces the security requirements.
Peraton Labs’ Cross-domain Language extensions for Optimal SecUre Refactoring and Execution (CLOSURE) project is building a toolchain to support the development, refactoring, and correct-by-construction partitioning of applications and configuration of the guards. Using the CLOSURE approach and toolchain, developers will express security intent through annotations applied to the program, which drive the program analysis, partitioning, and code auto-generation required by a GAPS application.
Problem: The machinery required to verifiable and securely establish communication between cross-domain systems (CDS) without jeopardizing data spillage is too complex to implement for many software platforms where such communication would otherwise be desired. To regulate data exchanges between domains, network architects rely on several risk mitigation strategies including human fusion of data, diodes, and hypervisors which are insufficient for future commercial and government needs as they are high overhead, customized to specific setups, prone to misconfiguration, and vulnerable to software/hardware security flaws. To streamline the design, development, and deployment of provably secure CDSs, new hardware and software co-design tools are needed to more effectively build cross-domain support directly into applications and associated hardware early in the development lifecycle.
Solution: Peraton Labs is developing CLOSURE (Cross-domain Language-extensions for Optimal SecUre Refactoring and Execution) to address the challenges associated with building cross-domain applications in software. CLOSURE extends existing programming languages by enabling developers the ability to express security intent through overlay annotations and security policies such that an application can be compiled to separable binaries for concurrent execution on physically isolated platforms.
The CLOSURE compiler toolchain interprets annotation directives and performs program analysis of the annotated program and produces a correct-by-construction partition if feasible. CLOSURE automatically generates and inserts serialization, marshaling, and remote-procedure call code for cross-domain interactions between the program partitions.
As stated in [1], “writing a tool capable of assuring a property that falls out of the scope of the standard type checking performed by a compiler is a task comparable to writing the compiler itself.” This is especially true with respect to C++ where an extensive list of object-oriented features makes it difficult to conduct the required program analysis to track the control and data dependencies incurred by multiple inheritance, polymorphism, templates, exceptions, overrides, etc. Unlike C and Java for which extensive research and tools for formal dependency modeling exists [2] [3], the set of such tools for C++ are sparse. While a large subset of the C language has been formalized [4] [5], formalisms for analysis of C++ programs [1] [6] handle a limited subset of the C++ language. For example, cpp2v [7] does not support templates, virtual inheritance, floating point numbers, and many other critical features. Other approaches to formalize C++ for analysis exhibit other limitations that fall short of their applicability to real-world C++ programs. The alternative approach of defining type extensions and rewriting the program based on those extensions so that the type checker can check for properties is not pragmatic due to the learning curve, lack of tooling support for non-standard extensions, and cost of rewriting legacy programs in the new dialect. Much of the CLOSURE analyses occur at the LLVM-IR level, and that generated by C++ is far more difficult to understand without proper tools (names are mangled, classes and objects are refactored, etc.). Further complicating the effort is the lack of reflection in C++ (afforded to us in Java) for explicitly describing the types of fields and structure of opaque objects as they are marshalled and serialized for streaming communications across guard devices.
In the final months of GAPS Phase 3, we turned our focus to that of a preliminary CLOSURE C++ toolchain proof-of-concept design. In the following sections, we document our findings, initial approach, and early implementation. Our proof-of-concept toolchain builds around a modern C++ frontend AST, though we stick to programs of C++98 complexity, incrementally adding support for modern features as they become of interest. Several of the toolchain steps are conducted manually with plans to automate them in future research.
Our approach follows the same high-level workflow that has proven successful for us in C and Java:
The CLOSURE architecture for C++ follows the same workflow that has proven successful for C and Java with language-specific differences in annotation, modeling, constraint, and code-generation steps as illustrated in Figure :
The key sub-modules of the C++ toolchain include:
Notable differences between the C++ proof-of-concept toolchain and its C/Java counterparts include:
Further details of the workflow are presented in following section.
Like the CLOSURE architecture, the CLOSURE workflow for C++ mimics the workflow that has proven successful for C and Java with language-specific differences in tooling. The Clang API is used, along with SVF for pointer analysis, to interpret annotations and generate the program dependency model. Along with MiniZinc, Z3 has been added as a second constraint solver and is useful for diagnostics. The workflow is illustrated in Figure :
The workflow is defined for a single source cpp follows:
.csv
files nodes.csv
and
edges.csv
that contains the nodes and edges of the
graph.clang
and
use our SVF-derived tool called DumpPTG
to extract
points-to relations between LLVM registers for each generated LLVM
function from the C++ source code. The output is a Points-To Graph, a
pair of csv files svf_nodes.csv
and
svf_edges.csv
where the nodes are memory locations
corresponding to LLVM registers (e.g., %4
or
@a
) and the edges are points-to relations.extract-declares
to
translate each of the points-to relations outputted the previous step
back to declarations at the program graph/C++ source level. By default,
each LLVM register which correspondsllvm.dbg.declare
relating the two, so that information is
used to create a map between certain LLVM registers and their
corresponding C++ declarationsextract-declares
program is run, we can use
another small python program points_to_edges.py
to
translate the points-to edges in the Points-To Graph to edges between
declarations in the program graph. These edges are added as new
PointsTo
edges into edges.csv
solver/main.py
takes in as input the
modified program with points to edges and the cle-json and outputs a
constraint satisfaction problem for Minizinc or optionally Z3.As of this prototype release, CLOSURE’s C++ toolchain supports up to the program division step for a limited subset of C++ language constructs, with automated generation of RPCs, DFDLs, and HAL configuration left for future research.
In the rest of this document, we first present a quick start guide followed by a detailed usage of the toolchain components. For each component, we describe what it does, provide some insight into how it works, discuss inputs and outputs and provide invocation syntax for usage. We conclude with a discussion of the limitations of our current toolchain and a roadmap for future work. We provide samples of significant input and output files in the appendices and provide a list of bibliographic references at the end.
The quickest way to get started with the C++ toolchain is using the provided Dockerfile to build an image after cloning the repository.
git clone https://github.com/gaps-closure/cpp-closure
cd cpp-closure
docker build -t gapsclosure/cpp-closure:develop .
Afterwards, one can use the Remote - Containers extension with VSCode to work within the CLE environment for C++. Alternatively, a docker image can be run via the command line:
docker run -it gapsclosure/cpp-closure:develop
Within the docker container, there will be a number of utilities,
libraries and python scripts under /opt/closure
. The
provided binaries should be part of the user’s $PATH
and
should be accessible from the command line.
The CLE annotations in a C++ program look different from that of C
programs. Every declaration can be annotated using
__attribute__((cle_annotate("LABEL_NAME")))
where
"LABEL_NAME"
is a reference to a CLE definition. A CLE
definition is defined using the
#pragma cle def <CLE_JSON>
, exactly the same as it is
in C.
When annotating a declaration, usually the
__attribute__((cle_annotate("ORANGE")))
can be positioned
in front of the declaration, or on the line above, like so:
((cle_annotate("ORANGE")))
__attribute__= Foo(); Foo foo
The only exception to this is class
annotations, where
the attribute statement must be placed after the class
keyword, like so:
((cle_annotate("PURPLE_SHAREABLE"))) Bar
class __attribute__{
...
};
Below are sample CLE definitions for ORANGE
and
PURPLE_SHAREABLE
for reference:
#pragma cle def ORANGE {"level":"orange"}
#pragma cle def PURPLE_SHAREABLE {"level":"purple",\
"cdf": [\
{"remotelevel":"orange", \
"direction": "egress", \
"guarddirective": { "operation": "allow"}}\
] }
An annotation will have its CLE label appear in the exported program
graph from the clang-plugin. For example, the following annotation
"ORANGE"
on the construction of Foo()
((cle_annotate("ORANGE")))
__attribute__= Foo(); Foo foo
will appear in the nodes.csv
file, like so:
1,Decl.Var,"foo",ORANGE,4,,4,,Foo.cpp,245,301
...
Additionally, the clang-plugin will output the CLE definitions in a
collated.json
file. For example, our definition for
ORANGE
will appear in the json in the following manner:
[{"cle-label": "PURPLE","cle-json":{"level":"purple"}},
...
]
The annotation labels and definitions are eventually used to create the constraint solver instance, either in Minizinc or Z3.
The following is table of node and edges types, a description of their purpose and where that information came from (either the Clang AST or SVF).
Node Name | Node Description |
---|---|
Decl.Var | All variable declarations. |
Decl.Function | A static global variable inside a function |
Decl.Record | A class, struct or union definition. |
Decl.Field | A record field. |
Decl.Method | A record method. Includes operator overloads |
Decl.Param | Formal parameter of a function/method. Contains information about parameter index |
Decl.Constructor | A constructor for a class/struct |
Decl.Destructor | A destructor for a class/struct |
Stmt.Decl | A statement which also has a declaration |
Stmt.Call | An clang statement that represents a call to a function, method, constructor or destructor. |
Stmt.Compound | An clang statement that only has other statements as children. |
Stmt.Ref | An clang statement which holds a reference to a decl |
Stmt.Field | An clang statement which is a field accessor |
Stmt.This | An clang statement which refers to ‘this’ |
Stmt.Return | A return within a given function |
Stmt.Other | An clang statement that does not transfer control to another function. |
Edge Name | Edge Description |
---|---|
Struct.Field | Connects a record to its field |
Struct.Method | Connects a record to its method |
Struct.Constructor | Connects a record to its constructor |
Struct.Destructor | Connects a record to its destructor |
Struct.Inherit | Connects two records by inheritance relation. Type of inheritance given by a property |
Struct.Param | Connects a function-like object to its parameters |
Struct.Child | When a statement has a child not described by the above. Can be pruned out of the graph |
Control.Entry | Connects a function-like object to it’s body |
Control.FunctionInvocation | A direct call invocation to a C-style function |
Control.MethodInvocation | A method call |
Control.ConstructorInvocation | A constructor call |
Control.DestructorInvocation | A call to a destructor |
Data.DefUse | A def-use relation between statements |
Data.ArgPass | An argument pass to a call instruction. Destination has a paramater index |
Data.Return | Connects a call to its resulting value at the call site |
Data.Object | Connects a method call/field to the object being called upon (e.g. f as in f.foo()) |
Data.FieldAccess | A field access |
Data.InstanceOf | Connects a ‘this’ instance to a class definition |
Data.Decl | Connects a decl statment to a decl |
Data.PointsTo | A points-to relation |
Edge Name | Source Type | Destination Type |
---|---|---|
Struct.Field | Decl.Record | Decl.Field |
Struct.Method | Decl.Record | Decl.Method |
Struct.Constructor | Decl.Record | Decl.Constructor |
Struct.Destructor | Decl.Record | Decl.Destructor |
Struct.Inherit | Decl.Record | Decl.Record |
Struct.Param | Decl | Decl.Param |
Struct.Child | Stmt | Stmt |
Control.Entry | Decl | Stmt |
Control.FunctionInvocation | Stmt.Call | Decl.Function |
Control.MethodInvocation | Stmt.Call | Decl.Method |
Control.ConstructorInvocation | Stmt.Call | Decl.Constructor |
Control.DestructorInvocation | Stmt.Call | Decl.Destructor |
Data.DefUse | Stmt.Decl | Stmt.Ref |
Data.ArgPass | Stmt.Call | Stmt |
Data.Return | Stmt.Call | Stmt |
Data.Object | Stmt.Call/Stmt.Field | Stmt |
Data.FieldAccess | Stmt | Decl.Field |
Data.InstanceOf | Stmt.This | Decl.Record |
Data.Decl | Stmt.Decl | Decl |
Data.PointsTo (from SVF+AST) | Decl | Decl |
See the C documentation [10] for description, purpose, and usage of the conflict analyzer. For the purposes of this documentation, we focus on the C++ CLOSURE constraint model in the text below.
In this section, we present an informal statement of constraints to be enforced by our conflict analyzer. We then present the main constraints coded in MiniZinc used by our model to achieve these constraints. More information about MiniZinc including its usage and syntax can be found in [11].
In the model below, the nodeEnclave
decision variable
stores the enclave assignment for each node, the taint
decision variable stores the label assignment for each node, and the
xdedge
decision variable stores whether a given edge is in
the enclave cut (i.e., the source and destination nodes of the edge are
in different enclaves). Several other auxiliary decision variables are
used in the constraint model to express the constraints or for efficient
compilation.
Every node must be assigned to a valid enclave. Functions and classes must be assigned to the same enclave as their containing node. The level of the label (taint) of every node must match the level of the enclave the node is assigned to.
constraint :: "NodeHasEnclave" forall (n in NodeIdx) (nodeEnclave[n] != nullEnclave);
constraint :: "NodeEnclaveIsFunEnclave" forall (n in NodeIdx) ((hasFunction[n] != 0) -> (nodeEnclave[n] == nodeEnclave[hasFunction[n]]));
constraint :: "NodeEnclaveIsClassEnclave" forall (n in NodeIdx) ((hasClass[n] != 0) -> (nodeEnclave[n] == nodeEnclave[hasClass[n]]));
constraint :: "NodeLevelAtEnclaveLevel" forall (n in NodeIdx) (hasLabelLevel[taint[n]] == hasEnclaveLevel[nodeEnclave[n]]);
Only functions, methods, and constructors can be assigned a function annotation. Function annotations can only be made by the user.
constraint :: "FnAnnotationForFnOnly"
forall (n in NodeIdx) (isFunctionAnnotation[taint[n]] -> isFunction(n));
constraint :: "FnAnnotationByUserOnly"
forall (n in FunOrMethodOrCon) (isFunctionAnnotation[taint[n]] -> userAnnotatedNode[n]);
constraint :: "annotationOnFunctionIsFunAnnotation"
forall (n in FunOrMethodOrCon) (userAnnotatedNode[n] -> isFunctionAnnotation[taint[n]]);
Annotations on classes and their fields are node annotations, not function annotations.
constraint :: "annotationOnClassIsNodeAnnotation"
forall (n in Decl_Record) (userAnnotatedNode[n] -> not (isFunctionAnnotation[taint[n]]));
constraint :: "annotationOnFieldIsNodeAnnotation"
forall (n in Decl_Field) (userAnnotatedNode[n] -> not (isFunctionAnnotation[taint[n]]));
All nodes in an un-annotated function must have the taint of the function. All nodes in an un-annotated class must have the taint of the class. All nodes in an un-annotated class must be un-annotated and have the taint of the class.
constraint :: "UnannotatedFunContentTaintMatch"
forall (n in NodeIdx where hasFunction[n] != 0) ((not userAnnotatedNode[hasFunction[n]]) -> (taint[n] == ftaint(n)));
constraint :: "UnannotatedClassTaintsMatch"
forall (n in NodeIdx where hasClass[n] != 0)
((not userAnnotatedNode[hasClass[n]]) -> (taint[n] == taint[hasClass[n]]));
constraint :: "noAnnotatedDataForUnannotatedClass"
forall (n in NodeIdx where hasClass[n] != 0)
((not userAnnotatedNode[hasClass[n]]) -> (taint[n] == taint[hasClass[n]]));
constraint :: "noAnnotatedDataForUnannotatedClass"
forall (n in NodeIdx where hasClass[n] != 0)
((not userAnnotatedNode[hasClass[n]]) -> (not userAnnotatedNode[n]));
Un-annotated constructors, destructors, and methods must have the taint of the class.
constraint :: "unannotatedConstructorGetsClassTaint"
forall (n in Decl_Constructor)
((hasClass[n] != 0 /\ not userAnnotatedNode[n]) -> (taint[n] == taint[hasClass[n]]));
constraint :: "unannotatedDestructorGetsClassTaint"
forall (n in Decl_Destructor)
((hasClass[n] != 0 /\ not userAnnotatedNode[n]) -> (taint[n] == taint[hasClass[n]]));
constraint :: "unannotatedMethodGetsClassTaint"
forall (n in Decl_Method)
((hasClass[n] != 0 /\ not userAnnotatedNode[n]) -> (taint[n] == taint[hasClass[n]]));
Classes connected by an inheritance relationship share the same taint.
constraint :: "inheritTaint"
forall (e in Record_Inherit) (esTaint(e) == edTaint(e));
All nodes in an annotated function must have a taint in the ARCtaints.
constraint :: "AnnotatedFunContentCoercible"
forall (n in NodeIdx where (hasFunction[n] != 0) /\ (not isFunction(n)))
(userAnnotatedNode[hasFunction[n]] -> isInArctaint(ftaint(n), taint[n], hasLabelLevel[taint[n]]));
Annotated constructors (which are only valid in annotated classes) must have the class taint as their sole rettaint in every CDF.
constraint :: "annotatedConstructorReturnsClassTaint"
forall (n in Decl_Constructor)
(userAnnotatedNode[n] ->
(forall (lvl in Level)
((cdfForRemoteLevel[taint[n], lvl] != nullCdf) -> hasRettaints[cdfForRemoteLevel[taint[n], lvl], taint[hasClass[n]]])));
Any function whose address is taken in the program cannot have a function annotation.
constraint :: "FunctionPtrSinglyTainted"
forall (e in Data_PointsTo) (isFunction(hasDest[e]) -> not userAnnotatedNode[hasDest[e]]);
The control flow can never leave an enclave, unless it is done through an approved cross-domain call, as expressed in the following constraints. The only control edges allowed in the cross-domain cut are either call invocations or returns. For any call invocation edge in the cut, the function annotation of the function entry being called must have a CDF that allows (with or without redaction) the level of the label assigned to the callsite. The label assigned to the callsite must have a node annotation with a CDF that allows the data to be shared with the level of the (taint of the) function entry being called.
constraint :: "NonCallRetControlEnclaveSafe"
forall (e in Control_EnclaveSafe) (xdedge(e) == false);
constraint :: "XDCallBlest"
forall (e in Control_Invocation) (xdedge(e) -> userAnnotatedNode[hasDest[e]]);
constraint :: "XDCallAllowed"
forall (e in Control_Invocation) (xdedge(e) -> allowOrRedact(edFunCdf(e)));
constraint :: "XDReturnAllowed"
forall (e in Control_Return) (xdedge(e) -> allowOrRedact(esFunCdf(e)));
Note: The conflict analyzer is working with the annotated
unpartitioned code and not the fully partitioned code which will
includes autogenerated code. The actual cut in the partitioned code with
autogenerated code to handle cross-domain communications will be between
the cross-domain send and receive functions that are several steps
removed from the cut in the xdedge
variable at this stage
of analysis. The autogenerated code will apply annotations to
cross-domain data annotations that contain GAPS tags, and they will have
a different label. So we cannot check whether the label of the arguments
passed from the caller matches the argument taints allowed by the called
function, or if the return taints match the value to which the return
value is assigned. A downstream verification tool will check this.
Data can only leave an enclave through parameters or return of valid cross-domain call invocations, as expressed in the following constraints.
Cross-domain returns must be allowed by the CDF, but the taints need not match (can’t enforce taint matching as the level must change). Cross-domain Points-To edges must be allowed by the CDF, but the taints need not match.
All children of returns must have the same taint as the return statement. Return statements in annotated functions must have a taint allowed by the CDF.
Note: The reason cross domain points-to edges are not disallowed completely is because passing arrays necessitates cross-domain points-to edges.
constraint :: "EnclaveSafeDataEdges"
forall (e in Data_EnclaveSafe) (xdedge(e) == false);
constraint :: "ReturnChildMatchesReturn"
forall (n in Stmt_Return) (
forall (c in NodeIdx) (ancestor(n, c) -> taint[n] == taint[c])
);
constraint :: "ReturnStmtMatchesRettaint"
forall (n in Stmt_Return where annotFun(n)) (
hasRettaints[funCdf(n), taint[n]]
);
constraint :: "XDPointsToAllowed"
forall (e in Data_PointsTo)
(xdedge(e) -> (allowOrRedact(cdfForRemoteLevel[edTaint(e), hasLabelLevel[esTaint(e)]]) /\ not isFunction(hasDest[e])));
While the constraints on the control dependency and data dependency that we discussed governed data sharing at the cross-domain cut, we still need to perform taint checking to ensure that data annotated with different labels inside each enclave are managed correctly and only when the mixing of the taints is explicitly allowed by the user.
Labels can be coerced (i.e., nodes of a given PDG edge can be permitted to have different label assignments) inside an enclave only through user annotated functions.
Any data dependency or parameter edge that is intra-enclave (not in the cross-domain cut) and with different CLE label taints assigned to the source and destination nodes must be coerced (through an annotated function).
One may wonder whether a similar constraint must be added for control
dependency edges at the entry block for completeness. Such a constraint
is not necessary given our inclusion of the
UnannotatedFunContentTaintMatch
and
AnnotatedFunContentCoercible
constraints discussed earlier.
However, pointer dependencies are captured by a chain of points-to edges
which may be intra-function edges. Therefore, we restrict intra-function
points-to edges to have the same taint, even in annotated functions.
predicate intraFunEdge(DataEdge: e) =
(hasFunction[hasSource[e]] != 0 /\ hasFunction[hasDest[e]] != 0 /\
hasFunction[hasSource[e]] == hasFunction[hasDest[e]]);
constraint :: "intraFunPointsToTaintsMatch"
forall (e in Data_PointsTo) (intraFunEdge(e) -> esTaint(e) == edTaint(e));
For ANY data edge between two function-external nodes, the taints must match.
predicate globalGlobalEdge(DataEdge: e) = (isGlobal[hasSource[e]] /\ isGlobal[hasDest[e]]);
constraint :: "externExternDataEdgeTaintsMatch"
forall (e in DataEdge) (globalGlobalEdge(e) -> (esTaint(e) == edTaint(e)));
For ANY data edge between data within a function and a function-external node, the taints must match.
predicate srcFunExternEdge(DataEdge: e) = (hasFunction[hasSource[e]] != 0 /\ hasFunction[hasDest[e]] == 0);
predicate destFunExternEdge(DataEdge: e) = (hasFunction[hasDest[e]] != 0 /\ hasFunction[hasSource[e]] == 0);
predicate funExternEdge(DataEdge: e) = (srcFunExternEdge(e) \/ destFunExternEdge(e));
constraint :: "externDataEdgeTaintsMatch"
forall (e in DataEdge) (funExternEdge(e) -> esTaint(e) == edTaint(e));
This leaves taint propagation for inter-function data edges. For non-XD return edges from a callee function to a callsite, if the callee (source) function is un-annotated, the taints must match and if the callee (source) function is annotated, the taint on the dest node must be in the callee’s rettaints.
constraint :: "returnFromUnannotatedTaintsMatch"
forall (e in Control_Invocation where not destAnnotFun(e))
(taintsAgree(esTaint(e), edTaint(e)));
constraint :: "returnInRettaints"
forall (e in Control_Invocation where destAnnotFun(e) /\ not xdedge(e))
(hasRettaints[edFunCdf(e), esTaint(e)]);
constraint :: "callsiteTaintMatchesAllReturns"
forall (e in Control_Invocation where destAnnotFun(e) /\ not xdedge(e))
(forall (r in Stmt_Return where hasFunction[r] == hasDest[e])
(taintsAgree(taint[r], esTaint(e))));
For non-XD argument passing edges, if the destination function is un-annotated, the taint of the argument must match the taint of the destination function and if the destination function is annotated, the taint of the argument must be in the argtaints of the function at that parameter index. All children of anything passed as an argument must have the same taint as the argument node.
constraint :: "argumentToUnannotatedTaintsMatch"
forall (e in Control_Invocation where not destAnnotFun(e))
(forall (arg_e in Data_ArgPass where hasSource[arg_e] == hasSource[e])
(edTaint(arg_e) == edTaint(e)));
constraint :: "argumentInArgtaints"
forall (e in Control_Invocation where destAnnotFun(e) /\ not xdedge(e))
(forall (arg_e in Data_ArgPass where hasSource[arg_e] == hasSource[e])
(hasArgtaints[edFunCdf(e), hasParamIdx[hasDest[arg_e]], taint[hasDest[arg_e]]]));
constraint :: "argumentChildren"
forall (e in Data_ArgPass) (
forall(n in NodeIdx where ancestor(hasDest[e], n))
(taint[n] == edTaint(e))
);
Inter-function points-to edges should always have the same taint.
predicate interFunEdge(DataEdge: e) =
(hasFunction[hasSource[e]] != 0 /\ hasFunction[hasDest[e]] != 0 /\
hasFunction[hasSource[e]] != hasFunction[hasDest[e]]);
constraint :: "interFunPointsToTaintsMatch"
forall (e in Data_PointsTo)
((not xdedge(e) /\ interFunEdge(e)) -> esTaint(e) == edTaint(e));
Lastly, a node has the ALL label if and only if it is contained in an unpinned class. A class is pinned if: the class is annotated, anything in the class is annotated, or it depends on anything external to the class.
constraint :: "definitionForALL"
forall (n in NodeIdx)
(((hasClass[n] != 0 /\ not isPinned(hasClass[n])) \/ (isClass(n) /\ not isPinned(n))) == (taint[n] == ALL));
In this model, we require the solver to provide a satisfying assignment that minimizes the total number of call invocations that are in the cross-domain cut. Other objectives could be used instead.
var int: objective = sum(e in ControlDep_CallInv, l in nonNullEnclave where xdedge[e,l])(1);
solve minimize objective;
Once the CAPO partitioning conflict analyzer has analyzed the CLE-annotated application code and determined that all remaining conflicts are resolvable by RPC-wrapping to result in a security-compliant cross-domain partitioned code, the conflict analyzer will produce a topology file (JSON) containing the assignment of every class to an enclave/level. A full-length version of the topology can be found in the appendix
Once the CAPO partitioning conflict analyzer has analyzed the CLE-annotated application code, and determined that all remaining conflicts are resolvable by RPC-wrapping to result in a security compliant cross-domain partitioned code, the conflict analyzer will save the code in the refactored directory along with a topology file (JSON) containing the assignment of every function and global variable to an enclave/level. A sample topology JSON is provided below.
{
"levels": ["orange", "purple"],
"enclaves": ["purple_E", "orange_E"],
"source_path": "./refactored",
"functions": [
{"name": "get_a", "level": "orange", "enclave": "orange_E", "file": "test1_refactored.c", "file_offset": 29},
{"name": "main", "level": "purple", "enclave": "orange_E", "file": "test1_refactored.c", "file_offset": 35},
// ...
],
"global_scoped_vars": [
{"name": "globalScopeVarNotFunctionStatic", "level": "purple", "file": "test1_refactored.c", "file_offset": 5},
// ...
],
// ...
}
Given the refactored, annotated application, and the topology, the
divider creates a divvied
directory, divides the code into
files in separate subdirectories (one per enclave), such that the source
code for each function or global variable is placed in its respective
enclave. Furthermore, all related code like type, variable, and function
declarations, macro definitions, header includes, and pragmas are
handled, so that the source in each directory has all the relevant code,
ready for automated partitioning and code generation for RPC-wrapping of
functions, and marshaling, tagging, serialization, and DFDL description
of cross-domain data types.
This divvied
source becomes the input to the GAPS
Enclave Definition Language (GEDL) generator tool. The GEDL drives
further code generation and modification needed to build the application
binaries for each enclave.
The usage of the program divider is straightforward:
divider_plugin [-h] TOPOLOGYJSON [--output-dir=OUTPUT_DIR] [--extra-arg=CLANG_ARGS]* --
The divider will always look in a directory named by the
source_path
field in the topology JSON. For example, in the
sample above, it is called refactored
under the working
directory for any *.cpp
, *.c
,
*.hpp
or *.h
files that may comprise the
program. The –extra-arg option can be repeated multiple times to specify
clang flags, e.g. header include paths.
The CLOSURE C++ RPC generation is a manual process in this state of the prototype toolchain. Future work would build upon the corresponding tools developed for the C toolchain which generates the following artifacts:
GEDL, a JSON document specifying all the cross domain calls and
their associated data including the arguments and return type. This gedl
is generated in an llvm opt
pass which analyzes the divvied
code.
IDL, a text file with a set of C style struct data type
definitions, used to facilitate subsequent serialization and marshaling
of data types. The IDL syntax is based on C; an IDL file contains one or
more C struct datatypes. Two structs are generated for each TAG request
and response pair, with the in arguments in the request and the out
arguments in the response. The idl_generator
script takes a
gedl JSON and produces the idl.
CODEC. For each struct in the IDL, a set of coding and decoding
functions is generated to facilitate serialization to the remote
enclave. The codecs consist of encode, decode, print functions for each
of the structs, which handle byte order conversions between host and
network byte order. The codecs can be generated using
hal_autogen
.
RPC, remote procedure code that automates cross domain function invocation and replaces calls to those functions with ones that additionally marshall and send the data across the network using the generated codecs and IDL.
DFDL, is an extension of XSD which provides a way to describe
binary formats and easily encode/decode from binary to an xml infoset.
CLOSURE has the ability to create DFDL schemas for each cross domain
request/response pair with use of the hal_autogen
.
HAL Configurations. The hal_autoconfig.py
produces a
HAL-Daemon configuration.
In the meantime, we produce some of these manually to facilitate experimentation. See the C++ examples for details.
To work with the richer language constructs in C++, the RPC generator needs significant enhancement. It is expected that it needs to play the role of the AspectJ compiler in the Java toolchain. In addition, the CLOSURE library classes such as HalZmq and ClosureShadow and others will need to be implemented correspondingly in the C++ toolchain.
The general idea of marshalling/unmarshalling and
serialization/deserialization of cross domain constructs remains the
same as in the C toolchain. However, the current
idl_generator
and the hal_autogen
scripts can
only handle a flattened structure containing primitive types. They need
to be extended to handle complex data types such as classes and nested
structures. A depth-first tree traversal algorithm needs to be
implemented to serialize and deserialize nested structures. For
typedefs, the toolchain needs to take advantage of type information in
llvm opt, and makes the actual types available to in gedl. Tools in the
later phase are not expected to deal with typedefs directly.
The CLOSURE C++ toolchain’s interface to the HAL daemon combines the approaches used in the CLOSURE C and Java toolchains.
With the CLOSURE C toolchain [10], partitioned application programs use the xdcomms API library to interface with the HAL daemon which exchanges data through the GAPS Devices.
In the Java toolchain, there is no separate xdcomms API library, and the interaction with the HAL daemon is handled by the code generated by the CodeGenJava tool. AspectJ-woven applications interact with the HAL through a CLOSURE library class. Details of marshaling/unmarshalling, serialization/de-serialization, and read/write to HAL, are abstracted and hidden from the applications
The C++ toolchain reuses the xdcomm API library to interface with the HAL daemon for data transport. For object-oriented features, such as object instantiation and method calls, since there is no reflection in C++ and AspectJ compiler, tools used in the C toolchain need to be modified and enhanced to automatically generate necessary code to work with C++ programs. This part is still in progress. In the meantime, see C++ example1 for code manually written to handle the HAL interface.
Example 1 is a simple example designed to gain insights and facilitate early development of CLOSURE toolchain for C++ programs. The targeted C++ features of the example are object instantiation and method calls, which, after partitioned by CLOSURE, are expected to be cross domain ones.
In this example, there are two classes Foo
and
Bar
with no inheritance relationship between them. The
annotation intention is to have Bar
be on an enclave named
“purple”, and Foo
to be present in both the “orange” and
“purple” enclave. We pin the main
function to a single
enclave by annotating a variable inside it (foo
) to be in
orange. We annotate Bar
with “PURPLE_SHAREABLE”, signifying
that all instances of Bar
must be in the purple enclave and
have the “PURPLE_SHAREABLE” label. Note that the rettaint of
Bar
’s constructor matches that of the class definition.
Bar::getValue
is a cross domain function returning
Bar::b
, which is shareable to the “orange” enclave. When
partitioned, the call to bar.getValue()
in
main
will become an RPC call.
#include "Bar.hpp"
#pragma cle def ORANGE {"level":"orange"}
class Foo
{
public:
int a;
int getValue()
{
return a;
}
() {
Foo= 1;
a }
};
int main(int argc, char **argv)
{
__attribute__((cle_annotate("ORANGE")))
= Foo();
Foo foo = Bar();
Bar bar int ret = foo.getValue() + bar.getValue();
// printf("Result is: %d\n", ret);
return ret;
}
#ifndef _BAR_
#define _BAR_
#pragma cle def PURPLE {"level":"purple"}
#pragma cle def PURPLE_SHAREABLE {"level":"purple",\
"cdf": [\
{"remotelevel":"orange", \
"direction": "egress", \
"guarddirective": { "operation": "allow"}}\
] }
#pragma cle def BAR_GETVALUE {"level":"purple",\
"cdf": [\
{"remotelevel":"orange", \
"direction": "bidirectional", \
"guarddirective": { "operation": "allow"}, \
"argtaints": [["PURPLE_SHAREABLE"]], \
"codtaints": ["PURPLE_SHAREABLE"], \
"rettaints": ["PURPLE_SHAREABLE"] \
}, \
{"remotelevel":"purple", \
"direction": "bidirectional", \
"guarddirective": { "operation": "allow"}, \
"argtaints": [["PURPLE_SHAREABLE"]], \
"codtaints": ["PURPLE_SHAREABLE"], \
"rettaints": ["PURPLE_SHAREABLE"] \
} \
] }
#pragma cle def BAR {"level":"purple",\
"cdf": [\
{"remotelevel":"orange", \
"direction": "bidirectional", \
"guarddirective": { "operation": "allow"}, \
"argtaints": [], \
"codtaints": ["PURPLE_SHAREABLE"], \
"rettaints": ["PURPLE_SHAREABLE"] \
}, \
{"remotelevel":"purple", \
"direction": "bidirectional", \
"guarddirective": { "operation": "allow"}, \
"argtaints": [], \
"codtaints": ["PURPLE_SHAREABLE"], \
"rettaints": ["PURPLE_SHAREABLE"] \
} \
] }
class __attribute__((cle_annotate("PURPLE_SHAREABLE"))) Bar
{
public:
int __attribute__((cle_annotate("PURPLE_SHAREABLE"))) b;
int c; // Assigned PURPLE or PURPLE_SHAREABLE
int __attribute__((cle_annotate("BAR_GETVALUE"))) getValue()
{
return b;
}
__attribute__((cle_annotate("BAR"))) Bar() {
= 2;
b }
};
#endif /* _BAR_ */
Example 1 can be built and analyzed using the familiar VSCode build tasks from the C Toolchain.
cd build/apps/examples/classes/example1/
code .
Use Ctrl-Shift-B to obtain the build tasks panel and run each task in numerical order.
The CLOSURE C++ prototype release supports up to and including the divider step for C++ programs featuring:
In the future, we would like to support more of C++98 and beyond, including but not limited to the following:
Other recommended improvements:
The topology.json
file is a description of level and
enclave assignments produced by the conflict analyzer and is used as input for
the code generator. It also contains information
about the callee and caller that will be in the cut.
The topology.json
contains:
The topology.json
manually generated for example1 is as
follows:
{
"enclaves": [
{
"name": "orange_E",
"level": "orange",
"assignedClasses": [
"Foo"
]
},
{
"name": "purple_E",
"level": "purple",
"assignedClasses": [
"Bar",
"Foo"
]
}
],
"levels": [
"purple",
"orange"
],
"functions": [
{
"name": "main",
"level": "orange",
"enclave": "orange_E"
}
],
"global_scoped_vars": [
],
"source_path": "/workspaces/cpp-closure/build/apps/examples/classes/example1/annotated"
}
The following contains type declarations for the MiniZinc model used within the conflict analyzer. These type declarations, along with a model instance generated in python are inputted to MiniZinc with the constraints to either produce a satisfiable assignment or some minimally unsatisfiable set of constraints.
%%%%%%%%%%%%%%%
% Graph nodes %
%%%%%%%%%%%%%%%
set of int: Decl_Var; % A variable declaration. It will be marked with global/static property
set of int: Decl_Function; % A static global variable inside a function
set of int: Decl_Record; % A class, struct or union definition
set of int: Decl_Field; % A record field
set of int: Decl_Method; % A record method. May have virtual/abstract property
set of int: Decl_Param; % A formal parameter of a function/method. Contains information about parameter index
set of int: Decl_Constructor; % A constructor for a class/struct
set of int: Decl_Destructor; % A destructor for a class/struct
set of int: Stmt_Decl; % A statement which also has a declaration
set of int: Stmt_Call; % A clang statement that represents a call to a function, method, constructor or destructor
set of int: Stmt_Compound; % A clang statement that only has other statements as children
set of int: Stmt_Ref; % A clang statement which holds a reference to a decl
set of int: Stmt_Field; % A clang statement which is a field accessor
set of int: Stmt_This; % A clang statement which refers to 'this'
set of int: Stmt_Return; % A return within a given function
set of int: Stmt_Other; % A clang statement that does not transfer control to another function
set of int: NodeIdx; % Spans all nodes
% Aggregate to unify constraints on functions, constructors, and methods
set of int: FunOrMethodOrCon = Decl_Function union Decl_Constructor union Decl_Method union Decl_Destructor;
%%%%%%%%%%%%%%%
% Graph edges %
%%%%%%%%%%%%%%%
set of int: Struct_Field; % Connects a record to its field | Decl.Record -> Decl.Field
set of int: Struct_Method; % Connects a record to its method | Decl.Record -> Decl.Method
set of int: Struct_Constructor; % Connects a record to its constructor | Decl.Record -> Decl.Constructor
set of int: Struct_Destructor; % Connects a record to its destructor | Decl.Record -> Decl.Destructor
set of int: Struct_Inherit; % Connects two records by inheritance relation | Decl.Record -> Decl.Record
set of int: Struct_Param; % Connects a function-like object to its parameter | Decl -> Decl.Param
set of int: Struct_Child; % When a statement has a child not described by the above | Stmt -> Stmt
set of int: Control_Return; % A function/method/constructor/destructor return | Stmt.Return -> Stmt.Call
set of int: Control_Entry; % Connects a function-like object to it's body | Decl -> Stmt
set of int: Control_FunctionInvocation; % A direct call invocation to a C-style function | Stmt.Call -> Decl.Function
set of int: Control_MethodInvocation; % A method call | Stmt.Call -> Decl.Method
set of int: Control_ConstructorInvocation; % A constructor call | Stmt.Call -> Decl.Constructor
set of int: Control_DestructorInvocation; % A call to a destructor | Stmt.Call -> Decl.Destructor
set of int: Data_PointsTo; % A points-to relation | Decl -> Decl
set of int: Data_DefUse; % A def-use relation between statements | Stmt.Decl -> Stmt.Ref
set of int: Data_ArgPass; % An argument pass to a call instruction | Stmt.Call -> Stmt
set of int: Data_Return; % Connects data being returned to the call site | Stmt -> Stmt.Call
set of int: Data_Object; % Connects a method call/field to the object being invoked | Stmt.Call/Stmt.Field -> Stmt
set of int: Data_FieldAccess; % A field access | Stmt -> Decl.Field
set of int: Data_InstanceOf; % Connects a 'this' instance to a class definition | Stmt.This -> Decl.Record
set of int: Data_Decl; % Connects a decl statment to a decl | Stmt.Decl -> Decl
set of int: EdgeIdx; % Spans all data and control edges
set of int: DataEdge =
Data_PointsTo union Data_DefUse union Data_ArgPass union
Data_Return union Data_Object union Data_FieldAccess union
Data_InstanceOf union Data_Decl;
% Aggregates of enclave-safe and invocation edges
set of int: Control_EnclaveSafe =
Struct_Field union Struct_Method union Struct_Constructor union
Struct_Destructor union Struct_Inherit union Control_Entry union Struct_Child union Struct_Param;
set of int: Control_Invocation =
Control_FunctionInvocation union Control_MethodInvocation union
Control_ConstructorInvocation union Control_DestructorInvocation;
set of int: Data_EnclaveSafe =
Data_FieldAccess union Data_InstanceOf union
Data_Object union Data_DefUse union Data_Decl union Data_ArgPass;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Containment mappings for nodes, edge endpoints, parameter indices, and node properties %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
array[NodeIdx] of bool: isGlobal;
array[NodeIdx] of int: hasClass; % 0 for node not in a class, otherwise maps to the Decl_Record containing the node
array[NodeIdx] of bool: userAnnotatedNode;
array[NodeIdx] of int: hasFunction; % 0 for globals, otherwise maps to the Decl_Function/Method/Constructor containing the node
array[EdgeIdx] of int: hasSource;
array[EdgeIdx] of int: hasDest;
array[NodeIdx] of int: hasParamIdx;
% Note: if n is a Decl_Record (i.e. a class declaration), then hasClass[n] == 0 if the class is not contained in another class,
% but if n is a Decl_Function then hasFunction[n] == n (i.e. a function contains itself but a class does not)
%%%%%%%%%%%%%%%%%%%
% CLE input model %
%%%%%%%%%%%%%%%%%%%
enum Level;
enum Enclave;
array[Enclave] of Level: hasEnclaveLevel;
enum cleLabel;
enum cdf;
enum GuardOperation = {nullGuardOperation, allow, deny, redact};
enum Direction = {nullDirection, bidirectional, egress, ingress};
int: MaxFnParams; % Upper bound on number of parameters for any function in the program
set of int: paramIdx = 1..MaxFnParams;
array[cleLabel] of Level: hasLabelLevel;
array[cleLabel] of bool: isFunctionAnnotation;
array[cdf] of cleLabel: fromCleLabel;
array[cdf] of Level: hasRemotelevel;
array[cdf] of GuardOperation: hasGuardOperation;
array[cdf] of Direction: hasDirection;
array[cdf] of bool: isOneway;
array[cleLabel, Level] of cdf: cdfForRemoteLevel;
array[cdf, cleLabel] of bool: hasRettaints;
array[cdf, cleLabel] of bool: hasCodtaints;
array[cdf, paramIdx, cleLabel] of bool: hasArgtaints;
array[cdf, cleLabel] of bool: hasARCtaints;
%%%%%%%%%%%%%%%%%%%%%%
% Decision variables %
%%%%%%%%%%%%%%%%%%%%%%
array[NodeIdx] of var Enclave: nodeEnclave;
array[NodeIdx] of var cleLabel: taint;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Utility functions and predicates %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
predicate isFunction(NodeIdx: n) = (n in FunOrMethodOrCon);
predicate isClass(NodeIdx: n) = (n in Decl_Record);
predicate allowOrRedact(var cdf: c) = (hasGuardOperation[c] == allow \/ hasGuardOperation[c] == redact);
predicate annotFun(NodeIdx: n) = hasFunction[n] != 0 /\ userAnnotatedNode[hasFunction[n]];
predicate sourceAnnotFun(EdgeIdx: e) =
(if hasFunction[hasSource[e]] != 0 then userAnnotatedNode[hasFunction[hasSource[e]]] else false endif);
predicate destAnnotFun(EdgeIdx: e) =
(if hasFunction[hasDest[e]] != 0 then userAnnotatedNode[hasFunction[hasDest[e]]] else false endif);
predicate isInArctaint(var cleLabel: fan, var cleLabel: tnt, var Level: lvl) =
(if isFunctionAnnotation[fan] then hasARCtaints[cdfForRemoteLevel[fan, lvl], tnt] else false endif);
predicate xdedge(EdgeIdx: e) = (nodeEnclave[hasSource[e]] != nodeEnclave[hasDest[e]]) /\ nodeEnclave[hasSource[e]] != all_E /\ nodeEnclave[hasDest[e]] != all_E;
predicate taintsAgree(var cleLabel: t1, var cleLabel: t2) = t1 == t2 \/ t1 == ALL \/ t2 == ALL;
predicate child(NodeIdx: n, NodeIdx: c) = exists (e in Struct_Child) (hasSource[e] == n /\ hasDest[e] == c);
predicate ancestor(NodeIdx: n, NodeIdx: c) = child(n, c) \/ exists (d in NodeIdx) (child(n, d) /\ ancestor(d, c));
function var cleLabel: esTaint(EdgeIdx: e) = taint[hasSource[e]];
function var cleLabel: edTaint(EdgeIdx: e) = taint[hasDest[e]];
function var cleLabel: esFunTaint(EdgeIdx: e) = if sourceAnnotFun(e) then taint[hasFunction[hasSource[e]]] else nullCleLabel endif;
function var cleLabel: edFunTaint(EdgeIdx: e) = if destAnnotFun(e) then taint[hasFunction[hasDest[e]]] else nullCleLabel endif;
function var cdf: funCdf(NodeIdx: n) = if annotFun(n) then cdfForRemoteLevel[ftaint(n), hasLabelLevel[taint[n]]] else nullCdf endif;
function var cdf: esFunCdf(EdgeIdx: e) = if sourceAnnotFun(e) then cdfForRemoteLevel[esFunTaint(e), hasLabelLevel[edTaint(e)]] else nullCdf endif;
function var cdf: edFunCdf(EdgeIdx: e) = if destAnnotFun(e) then cdfForRemoteLevel[edFunTaint(e), hasLabelLevel[esTaint(e)]] else nullCdf endif;
function var cleLabel: ftaint(NodeIdx: n) = if hasFunction[n] != 0 then taint[hasFunction[n]] else nullCleLabel endif;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Structural constraints on node enclave/level %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Nodes must have a non-null enclave which is shared within a function and matches their level
constraint :: "NodeHasEnclave" forall (n in NodeIdx) (nodeEnclave[n] != nullEnclave);
constraint :: "NodeEnclaveIsFunEnclave" forall (n in NodeIdx) ((hasFunction[n] != 0) -> (nodeEnclave[n] == nodeEnclave[hasFunction[n]]));
constraint :: "NodeEnclaveIsClassEnclave" forall (n in NodeIdx) ((hasClass[n] != 0) -> (nodeEnclave[n] == nodeEnclave[hasClass[n]]));
constraint :: "NodeLevelAtEnclaveLevel" forall (n in NodeIdx) (hasLabelLevel[taint[n]] == hasEnclaveLevel[nodeEnclave[n]]);
%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Classifying annotations %
%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Annotations on functions, methods, and constructors are function annotations and can only be made by the user.
% No other node should have a function annotation.
constraint :: "FnAnnotationForFnOnly"
forall (n in NodeIdx) (isFunctionAnnotation[taint[n]] -> isFunction(n));
constraint :: "FnAnnotationByUserOnly"
forall (n in FunOrMethodOrCon) (isFunctionAnnotation[taint[n]] -> userAnnotatedNode[n]);
constraint :: "annotationOnFunctionIsFunAnnotation"
forall (n in FunOrMethodOrCon) (userAnnotatedNode[n] -> isFunctionAnnotation[taint[n]]);
% Annotations on classes and their fields are node annotations, not function annotations.
constraint :: "annotationOnClassIsNodeAnnotation"
forall (n in Decl_Record) (userAnnotatedNode[n] -> not (isFunctionAnnotation[taint[n]]));
constraint :: "annotationOnFieldIsNodeAnnotation"
forall (n in Decl_Field) (userAnnotatedNode[n] -> not (isFunctionAnnotation[taint[n]]));
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Structural taint relationships between nodes %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% All nodes in an un-annotated function must have the taint of the function
constraint :: "UnannotatedFunContentTaintMatch"
forall (n in NodeIdx where hasFunction[n] != 0) ((not userAnnotatedNode[hasFunction[n]]) -> (taint[n] == ftaint(n)));
% All nodes in an un-annotated class must have the taint of the class
% (may be ALL if multiple taints construct the class, but not if the class accesses tainted data)
constraint :: "UnannotatedClassTaintsMatch"
forall (n in NodeIdx where hasClass[n] != 0)
((not userAnnotatedNode[hasClass[n]]) -> (taint[n] == taint[hasClass[n]]));
% All nodes in an un-annotated class must be un-annotated
constraint :: "noAnnotatedDataForUnannotatedClass"
forall (n in NodeIdx where hasClass[n] != 0)
((not userAnnotatedNode[hasClass[n]]) -> (not userAnnotatedNode[n]));
% Un-annotated constructors get the class taint
constraint :: "unannotatedConstructorGetsClassTaint"
forall (n in Decl_Constructor)
((hasClass[n] != 0 /\ not userAnnotatedNode[n]) -> (taint[n] == taint[hasClass[n]]));
% Un-annotated destructors get the class taint
constraint :: "unannotatedDestructorGetsClassTaint"
forall (n in Decl_Destructor)
((hasClass[n] != 0 /\ not userAnnotatedNode[n]) -> (taint[n] == taint[hasClass[n]]));
% Un-annotated methods get the class taint
constraint :: "unannotatedMethodGetsClassTaint"
forall (n in Decl_Method)
((hasClass[n] != 0 /\ not userAnnotatedNode[n]) -> (taint[n] == taint[hasClass[n]]));
% Classes connected by an inheritance relationship share the same taint
constraint :: "inheritTaint"
forall (e in Struct_Inherit) (esTaint(e) == edTaint(e));
% All nodes in an annotated function must have a taint in the ARCtaints
constraint :: "AnnotatedFunContentCoercible"
forall (n in NodeIdx where (hasFunction[n] != 0) /\ (not isFunction(n)))
(userAnnotatedNode[hasFunction[n]] -> isInArctaint(ftaint(n), taint[n], hasLabelLevel[taint[n]]));
% Annotated constructors (which are only valid in annotated classes) must have the class taint
% as their sole rettaint in every CDF
% TODO: doesn't validate that it is the only rettaint
constraint :: "annotatedConstructorReturnsClassTaint"
forall (n in Decl_Constructor)
(userAnnotatedNode[n] ->
(forall (lvl in Level)
((cdfForRemoteLevel[taint[n], lvl] != nullCdf) -> hasRettaints[cdfForRemoteLevel[taint[n], lvl], taint[hasClass[n]]])));
% Any function whose address is taken in the program cannot have a function annotation
constraint :: "FunctionPtrSinglyTainted"
forall (e in Data_PointsTo) (isFunction(hasDest[e]) -> not userAnnotatedNode[hasDest[e]]);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Control never leaves enclave except via valid XDC %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
constraint :: "NonCallRetControlEnclaveSafe"
forall (e in Control_EnclaveSafe) (xdedge(e) == false);
constraint :: "XDCallBlest"
forall (e in Control_Invocation) (xdedge(e) -> userAnnotatedNode[hasDest[e]]);
constraint :: "XDCallAllowed"
forall (e in Control_Invocation) (xdedge(e) -> allowOrRedact(edFunCdf(e)));
% constraint :: "XDReturnAllowed"
% forall (e in Control_Return) (xdedge(e) -> allowOrRedact(esFunCdf(e)));
% caller -[Control_Invocation]-> callee <-[hasFunction]- Stmt_Return = Return Edge
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Data never leaves enclave except via parameters or return for valid XDC %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
constraint :: "EnclaveSafeDataEdges"
forall (e in Data_EnclaveSafe) (xdedge(e) == false);
% Cross-domain returns must be allowed by the CDF, but the taints need not match
% (can't enforce taint matching as the level must change).
% constraint :: "XDReturnDataAllowed"
% forall (e in Data_Return)
% (xdedge(e) -> allowOrRedact(esFunCdf(e)));
% All children of returns must have the same taint as the return stmt
constraint :: "ReturnChildMatchesReturn"
forall (n in Stmt_Return) (
forall (c in NodeIdx) (ancestor(n, c) -> taint[n] == taint[c])
);
% Return statments in annotated functions must have a taint
% which is one of the rettaints
constraint :: "ReturnStmtMatchesRettaint"
forall (n in Stmt_Return where annotFun(n)) (
hasRettaints[funCdf(n), taint[n]]
);
% For any function called cross-domain, all its returns
% must have taints shareable with the callee level
constraint :: "XDReturnAllowed"
forall (e in Control_Invocation where xdedge(e)) (
forall (r in Stmt_Return where hasFunction[r] == hasDest[e])
(allowOrRedact(cdfForRemoteLevel[taint[r], hasLabelLevel[esTaint(e)]]))
);
% Cross-domain Points-To edges must be allowed by the CDF, but the taints need not match.
% Note: The reason cross domain points-to edges are not disallowed completely
% is because passing arrays necessitates cross-domain points-to edges.
constraint :: "XDPointsToAllowed"
forall (e in Data_PointsTo)
(xdedge(e) -> (allowOrRedact(cdfForRemoteLevel[edTaint(e), hasLabelLevel[esTaint(e)]]) /\ not isFunction(hasDest[e])));
%%%%%%%%%%%%%%%%%%%%%
% Taint propagation %
%%%%%%%%%%%%%%%%%%%%%
% Data edges within un-annotated functions are fully constrained by UnannotatedFunContentTaintMatch,
% because the source and destination are in the same function.
% Data edges within annotated functions are partially constrained by AnnotatedFunContentCoercible,
% so we have that the source and destination must be in the function's ARCtaints.
% Sometimes, however, pointer dependencies are captured by a chain
% of points-to edges which may be intra-function edges. Therefore
% we restrict intra-function points-to edges to have the
% same taint, even in annotated functions.
predicate intraFunEdge(DataEdge: e) =
(hasFunction[hasSource[e]] != 0 /\ hasFunction[hasDest[e]] != 0 /\
hasFunction[hasSource[e]] == hasFunction[hasDest[e]]);
constraint :: "intraFunPointsToTaintsMatch"
forall (e in Data_PointsTo) (intraFunEdge(e) -> esTaint(e) == edTaint(e));
% For ANY data edge between two function-external nodes, the taints must match
predicate globalGlobalEdge(DataEdge: e) = (isGlobal[hasSource[e]] /\ isGlobal[hasDest[e]]);
constraint :: "externExternDataEdgeTaintsMatch"
forall (e in DataEdge) (globalGlobalEdge(e) -> taintsAgree(esTaint(e), edTaint(e)));
% For ANY data edge between data within a function and a function-external node, the taints must match
predicate srcFunExternEdge(DataEdge: e) = (hasFunction[hasSource[e]] != 0 /\ hasFunction[hasDest[e]] == 0);
predicate destFunExternEdge(DataEdge: e) = (hasFunction[hasDest[e]] != 0 /\ hasFunction[hasSource[e]] == 0);
predicate funExternEdge(DataEdge: e) = (srcFunExternEdge(e) \/ destFunExternEdge(e));
constraint :: "externDataEdgeTaintsMatch"
forall (e in DataEdge) (funExternEdge(e) -> taintsAgree(esTaint(e), edTaint(e)));
% This leaves taint propagation for inter-function data edges.
% For non-XD return edges from a callee function to a callsite:
% -If the callee (source) function is un-annotated, the taints must match
% -If the callee (source) function is annotated, the taint on the dest node
% must be in the callee's rettaints
% constraint :: "retEdgeFromUnannotatedTaintsMatch"
% forall (e in Data_Return where not sourceAnnotFun(e))
% (taintsAgree(esTaint(e), edTaint(e)));
% constraint :: "returnNodeInRettaints"
% forall (e in Data_Return where sourceAnnotFun(e) /\ not xdedge(e))
% (hasRettaints[esFunCdf(e), edTaint(e)]);
% The taint of the returned value of a call to an unannotated function must match the taint
% of the unannotated function
constraint :: "returnFromUnannotatedTaintsMatch"
forall (e in Control_Invocation where not destAnnotFun(e))
(taintsAgree(esTaint(e), edTaint(e)));
% The taint of the returned value of a (not xd) call to an annotated function
% must match one of the rettaints
constraint :: "returnInRettaints"
forall (e in Control_Invocation where destAnnotFun(e) /\ not xdedge(e))
(hasRettaints[edFunCdf(e), esTaint(e)]);
% The taint of the returned value of a (not xd) call to an annotated function
% must match the taints of all of the callee's returns
constraint :: "callsiteTaintMatchesAllReturns"
forall (e in Control_Invocation where destAnnotFun(e) /\ not xdedge(e))
(forall (r in Stmt_Return where hasFunction[r] == hasDest[e])
(taintsAgree(taint[r], esTaint(e))));
% ArgPass edges: The callsite node (Stmt_Call) has ArgPass edges to each of its arguments, and call invocation edges
% to each potential callee. The ParamIdx is on the argpass edge destination and the Decl_Param (unused).
% For non-XD argument passing edges:
% -If the destination function is un-annotated, the taint of the argument must match the taint of the destination function
% -If the destination function is annotated, the taint of the argument must be in the argtaints of the function at that parameter index
constraint :: "argumentToUnannotatedTaintsMatch"
forall (e in Control_Invocation where not destAnnotFun(e))
(forall (arg_e in Data_ArgPass where hasSource[arg_e] == hasSource[e])
(taintsAgree(edTaint(arg_e), edTaint(e))));
constraint :: "argumentInArgtaints"
forall (e in Control_Invocation where destAnnotFun(e) /\ not xdedge(e))
(forall (arg_e in Data_ArgPass where hasSource[arg_e] == hasSource[e])
(hasArgtaints[edFunCdf(e), hasParamIdx[hasDest[arg_e]], taint[hasDest[arg_e]]]));
% All children of anything passed as an argument must have
% the same taint as the argument node
constraint :: "argumentChildren"
forall (e in Data_ArgPass) (
forall(n in NodeIdx where ancestor(hasDest[e], n))
(taint[n] == edTaint(e))
);
% Inter-function points-to edges should always have the same taint
predicate interFunEdge(DataEdge: e) =
(hasFunction[hasSource[e]] != 0 /\ hasFunction[hasDest[e]] != 0 /\
hasFunction[hasSource[e]] != hasFunction[hasDest[e]]);
constraint :: "interFunPointsToTaintsMatch"
forall (e in Data_PointsTo)
((not xdedge(e) /\ interFunEdge(e)) -> taintsAgree(esTaint(e), edTaint(e)));
%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Enabling shared classes %
%%%%%%%%%%%%%%%%%%%%%%%%%%%
% A class is pinned if:
% - the class is annotated
% - anything in the class is annotated
% - it depends on anything external to the class
% Note: relies on all edges following a "source depends on destination" convention
% (as opposed to "destination depends on source"). Can be reversed, or specialized per edge type.
% TODO: this will pin an inherited class, when we may want both to be unpinned. Exclude the Struct_Inherit edge?
% TODO: this may be an expensive predicate. It can be pre-computed and placed into an array. Not sure what the backend does with this.
predicate isPinned(Decl_Record: r) = (userAnnotatedNode[r] \/
(exists (n in NodeIdx) (hasClass[n] == r /\ userAnnotatedNode[n])) \/
(exists (e in EdgeIdx) (hasClass[hasSource[e]] == r /\ hasClass[hasDest[e]] != r)));
% A node has the ALL label if and only if it is contained in an unpinned class
constraint :: "definitionForALL"
forall (n in NodeIdx)
(((hasClass[n] != 0 /\ not isPinned(hasClass[n])) \/ (isClass(n) /\ not isPinned(n))) == (taint[n] == ALL));
See the C documentation [10].
The following dockerfile is used to build a source release from scratch.
FROM ubuntu:focal
ARG DEBIAN_FRONTEND=noninteractive
RUN apt -y update
# general prereqs
RUN apt -y install \
\
make \
git \
libz3-dev \
sudo \
cmake \
python3 \
python3-pip
unzip
# python dependencies
RUN pip install \
\
intervaltree \
pytest
z3-solver
# llvm-14 prereqs
RUN apt -y install \
\
lsb-release \
wget \
software-properties-common
gnupg
# install llvm/clang 14 with API
RUN cd /tmp \
&& wget https://apt.llvm.org/llvm.sh \
&& chmod +x llvm.sh \
&& ./llvm.sh 14 all
RUN apt install -y \
\
clang-14 \
libclang-14-dev
llvm-14
# link clang-14 binaries
RUN update-alternatives --install /usr/bin/c++ c++ /usr/bin/clang++-14 60 \
&& update-alternatives --install /usr/bin/clang clang /usr/bin/clang-14 60 \
&& update-alternatives --install /usr/bin/opt opt /usr/bin/opt-14 60
# install minizinc
RUN cd /opt/ \
&& sudo mkdir minizinc \
&& sudo wget https://github.com/MiniZinc/MiniZincIDE/releases/download/2.5.5/MiniZincIDE-2.5.5-bundle-linux-x86_64.tgz \
&& sudo tar -xvzf MiniZinc* -C minizinc --strip-components=1
# make closure user
RUN useradd -ms /bin/bash closure \
&& usermod -aG sudo closure \
&& echo "%sudo ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
USER closure
ENV HOME /home/closure
# copy and build closure
COPY --chown=closure . /tmp/cpp-closure
RUN cd /tmp/cpp-closure \
&& sudo mkdir /opt/closure \
&& sudo chown closure /opt/closure \
&& BUILD_DIR=/tmp/cpp-closure/tmp-build INSTALL_DIR=/opt/closure make install
ENV PATH=/opt/minizinc/bin:/opt/closure/bin:$PATH