
Feat/add automatic rounding #558

Draft
wants to merge 12 commits into base: main

Conversation

@andrei-stoian-zama (Collaborator) commented Mar 25, 2024

Adds an automatic rounding mechanism.

  1. Goes through the graph and finds TLU subgraph nodes.
  2. Analyzes the outputs of those TLUs on the calibrated input range.
  3. Identifies the intervals (step sizes) where the TLU is constant and the offsets where the intervals start.
  4. Raises the precision of TLU inputs to map the step sizes to a power of two.
  5. Modifies the graph and the TLU to include rounding and cancel out the raised precision in the TLU.
  6. Applies the previous steps to all TLUs, for each unique dimension of the TLU output (like apply_mapped_lookup_table).
  7. Finds the best rounding configuration by taking the one with the lowest MSE reconstruction with respect to the original TLU.
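Steps 2–4 above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation in preprocessors.py; the function names and the gcd-based step detection are my own assumptions:

```python
from typing import Tuple

import numpy as np


def find_tlu_steps(tlu_outputs: np.ndarray) -> Tuple[int, int]:
    """Find the interval width over which the TLU output is constant,
    and the offset at which the first interval ends.

    `tlu_outputs` is the TLU evaluated over the calibrated input range.
    Returns (step_size, offset); step_size is 0 for a constant TLU.
    """
    # Indices where the TLU output changes value
    jumps = np.nonzero(np.diff(tlu_outputs))[0] + 1
    if jumps.size == 0:
        return 0, 0  # constant TLU: it could be eliminated entirely
    offset = int(jumps[0])
    if jumps.size == 1:
        # Single jump: treat the remaining range as one step (heuristic)
        return int(tlu_outputs.size - offset), offset
    # A common divisor of the jump spacings gives the constant-interval width
    return int(np.gcd.reduce(np.diff(jumps))), offset


def rounding_bits(step_size: int) -> int:
    """LSBs that can be rounded away when the TLU is constant over
    intervals of `step_size` consecutive inputs (power-of-two part only)."""
    return int(np.floor(np.log2(step_size))) if step_size > 1 else 0
```

For example, a staircase TLU that is constant over runs of 4 inputs would allow 2 low bits to be rounded away before the table lookup.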

@cla-bot cla-bot bot added the cla-signed label Mar 25, 2024
@andrei-stoian-zama andrei-stoian-zama force-pushed the feat/add_automatic_rounding branch 2 times, most recently from deba506 to 2700f25 Compare March 27, 2024 20:11
@andrei-stoian-zama andrei-stoian-zama changed the base branch from main to chore/update_cp_for_approx_rounding March 27, 2024 20:11
Makefile Outdated
@@ -54,7 +54,6 @@ setup_env:
echo "Finished installing poetry lock."

echo "Installing $(CONCRETE_PYTHON_VERSION)" && \
poetry run python -m pip install -U --pre "$(CONCRETE_PYTHON_VERSION)"
Collaborator Author:

I have no idea why this is here.

conftest.py Outdated
f"# {randomly_seed} # {str(request.node.own_markers)}"
)

print(f"{derivation_string=}")
Collaborator Author:

to be removed

@andrei-stoian-zama andrei-stoian-zama marked this pull request as ready for review March 27, 2024 20:20
@andrei-stoian-zama andrei-stoian-zama requested a review from a team as a code owner March 27, 2024 20:20
@jfrery jfrery force-pushed the chore/update_cp_for_approx_rounding branch from abafd1d to dd3f088 Compare March 28, 2024 13:53
Base automatically changed from chore/update_cp_for_approx_rounding to main March 28, 2024 15:44

Coverage failed ❌

Coverage details

---------- coverage: platform linux, python 3.8.18-final-0 -----------
Name                                      Stmts   Miss  Cover   Missing
-----------------------------------------------------------------------
src/concrete/ml/common/preprocessors.py     329     22    93%   101, 108, 135, 147-148, 206, 308, 378, 455, 674-681, 779, 860-863, 941-942
src/concrete/ml/pytest/torch_models.py      645      5    99%   1628-1629, 1635-1636, 1647
-----------------------------------------------------------------------
TOTAL                                      7511     27    99%

51 files skipped due to complete coverage.

Makefile Outdated
@@ -9,6 +9,7 @@ SRC_DIR:=src
TEST?=tests
N_CPU?=4
CONCRETE_PACKAGE_PATH=$(SRC_DIR)/concrete
SPHINX_APIDOC_EXCLUDE=$(SRC_DIR)/concrete/ml/common/preprocessors.py
Collaborator:

What is this exactly?

Collaborator Author:

For some reason, Sphinx was crashing on this file.

Collaborator Author:

this can now be removed since we don't have sphinx anymore


for node in sorted_nodes:
if node.operation == Operation.Input:
# Deepcopy on the fhe.Graph doesn't modify the input node/indices mappings
Collaborator:

Not sure what this comment is about.

}
attributes = {}
if rounding_function.__name__ == "round_bit_pattern":
# These kwargs are not supported atm
Collaborator:

Again, what is this comment? Should it have a FIXME?

rounding_node.properties["overflow_protection"] = overflow_protection
rounding_node.properties["exactness"] = exactness

rounding_node.bounds = a_node.bounds # Might be over/under-estimated
Collaborator:

Again, should this comment be removed?

@@ -0,0 +1,999 @@
"""Graph pre-processors for automatic rounding."""
Collaborator:

Overall looks good, but:

  • very few comments in the actual code, which makes the whole file hard to follow (I cannot imagine debugging this in the future)
  • some remaining comments are confusing / seem irrelevant



class TorchAutoRoundingTLUTester(nn.Module):
"""A small quantized network with Brevitas, trained on make_classification."""
Collaborator:

Bad copy-paste?

@@ -236,116 +236,7 @@ def q_impl(

p = input2_q_values.shape[-2]

# Remove the manual matrix multiplication when we can handle input precision with rounding
# FIXME: https://github.com/zama-ai/concrete-internal/issues/512
def enc_mul(x, y):
Collaborator:

Is this expected? Do we no longer need it?

reference = f(input_set)

if function_name in ["staircase_pot", "staircase"] and execution_number == 0:
# TODO: round to 1b or eliminate these TLUs entirely
Collaborator:

Do we need a FIXME here?

[circuit_no_optim_no_rounding.simulate(numpy.array([elt])) for elt in input_set]
)[..., 0]

graph_res = numpy.array([circuit.graph(numpy.array([elt])) for elt in input_set])[..., 0]
Collaborator:

Do we still need to test the graph (old VL)?

@@ -0,0 +1,198 @@
"""Unit tests for TLU optimization preprocessors."""
Collaborator:

A bit like above: looks great overall, but the very few (random) comments make the file hard to understand (several times I have no clue what is actually going on).


count_reinterpret = 0
count_tlu = 0
for line in model.quantized_module_.fhe_circuit.mlir.split("\n"):
Collaborator:

Should this be a function, either in source (pytest) or in this file? It looks important.
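A hedged sketch of what such a helper could look like. The op-name substrings below are assumptions for illustration, not necessarily the exact tokens emitted in the MLIR dump:

```python
def count_mlir_ops(mlir: str, op_names=("apply_lookup_table", "reinterpret_precision")) -> dict:
    """Count occurrences of selected op names in a textual MLIR dump,
    scanning line by line (as with fhe_circuit.mlir)."""
    counts = {name: 0 for name in op_names}
    for line in mlir.split("\n"):
        for name in op_names:
            if name in line:
                counts[name] += 1
    return counts
```

The test could then assert on `counts["reinterpret_precision"]` and `counts["apply_lookup_table"]` instead of maintaining the loop inline.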

"max_epochs": 10,
}

model = NeuralNetClassifier(**params)
Collaborator:

Small detail, but you could update instantiate_model_generic to be able to provide custom params and then use the preamble; that makes the test easier to control.

CURRENT_DIR = Path(__file__).resolve().parent
KEYGEN_CACHE_DIR = CURRENT_DIR.joinpath(".keycache")

# Add MPS (for macOS with Apple Silicon or AMD GPUs) support when error is fixed. For now, we
# observe a decrease in torch's top1 accuracy when using MPS devices
# FIXME: https://github.com/zama-ai/concrete-ml-internal/issues/3953
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
NUM_SAMPLES = int(os.environ.get("NUM_SAMPLES", 1))

NUM_SAMPLES = int(os.environ.get("NUM_SAMPLES", 1000 if SIMULATE_ONLY else 1))
Collaborator:

1000 samples, even with simulation, looks a bit too much, no? Why not keep 100?

from concrete.ml.quantization import QuantizedModule
from concrete.ml.torch.compile import compile_brevitas_qat_model

SIMULATE_ONLY = True
Collaborator:

Should we make this an env var?
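A minimal sketch of the suggested env-var approach (the variable name and default are my assumptions):

```python
import os


def simulate_only(default: bool = True) -> bool:
    """Read the flag from the environment, e.g. SIMULATE_ONLY=0 to disable."""
    return os.environ.get("SIMULATE_ONLY", "1" if default else "0") == "1"


# Module-level constant, resolved once at import time
SIMULATE_ONLY = simulate_only()
```

This keeps the current default behavior while letting CI jobs flip the flag without editing the file.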

@@ -76,11 +82,20 @@ def wrapper(*args, **kwargs):
# cache generated keys through `insecure_key_cache_location`. As the name suggests, these
# parameters are unsafe and should only be used for debugging in development
# Multi-parameter strategy is used in order to speed-up the FHE executions
base_configuration = Configuration()
Collaborator:

To be removed?

import sys

if sys.platform == "darwin":
print("skipping fhe evaluation on darwin platform")
Collaborator:

Why is that?

@RomanBredehoft (Collaborator) left a comment:

Thanks for this! Overall it looks great, but we might want to clean up some things at one point: important files/code are missing some comments to better explain what is going on (some parts are pretty obscure), several now-irrelevant comments should be removed, and some things could be simplified in tests.

assert isinstance(evaluator, GenericEvaluator)

if "subgraph" not in evaluator.properties["kwargs"]:
# Not supported for now
Collaborator Author:

When does this case arise? Is it for fhe.univariate use?

Collaborator:

Yes, univariate or explicit table use in Concrete.

In any case it's a good check, since in the following we try to insert nodes into the subgraph.

@fd0r fd0r marked this pull request as draft April 24, 2024 19:21
Makefile Outdated
@@ -9,6 +9,7 @@ SRC_DIR:=src
TEST?=tests
N_CPU?=4
CONCRETE_PACKAGE_PATH=$(SRC_DIR)/concrete
SPHINX_APIDOC_EXCLUDE=$(SRC_DIR)/concrete/ml/common/preprocessors.py
Collaborator Author:

this can now be removed since we don't have sphinx anymore

Makefile Outdated
@@ -415,7 +416,7 @@ docs_no_links: clean_docs check_docs_dollars
mkdir -p docs/_static/
@# Generate the auto summary of documentations
@# Cannot do without specifying top module currently with sphinx-apidoc
poetry run sphinx-apidoc --implicit-namespaces -o docs/_apidoc $(CONCRETE_PACKAGE_PATH)
poetry run sphinx-apidoc --implicit-namespaces -o docs/_apidoc $(CONCRETE_PACKAGE_PATH) $(SPHINX_APIDOC_EXCLUDE)
Collaborator Author:

same here

configuration = Configuration(
dump_artifacts_on_unexpected_failures=False,
enable_unsafe_features=True,
use_insecure_key_cache=True,
insecure_key_cache_location=KEYGEN_CACHE_DIR,
additional_pre_processors=[
Collaborator Author:

why do we need this? we should implement compile_torch_model with rounding_threshold_bits: { n_bits=AUTO, method = approximate }
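To illustrate the API shape proposed in this comment, the option might be normalized somewhere inside compile_torch_model. Everything below is hypothetical — none of these names exist in the codebase; it only sketches how an int, a dict, or a dedicated object could map onto one configuration:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class RoundingThresholdBits:
    """Hypothetical container for the proposed rounding option."""
    n_bits: Union[int, str] = "auto"  # an int, or "auto" for automatic selection
    method: str = "approximate"       # "approximate" or "exact"


def normalize_rounding(option) -> RoundingThresholdBits:
    """Accept an int, a dict, or a RoundingThresholdBits and normalize it."""
    if isinstance(option, int):
        # A bare int keeps the current behavior: exact rounding to n_bits
        return RoundingThresholdBits(n_bits=option, method="exact")
    if isinstance(option, dict):
        return RoundingThresholdBits(**option)
    return option
```

With something like this, `rounding_threshold_bits=6` stays backward compatible while `{"n_bits": "auto", "method": "approximate"}` would select the automatic pre-processor internally.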

cfg = Configuration(
verbose=True,
show_optimizer=args.show_optimizer,
additional_pre_processors=[
Collaborator Author:

see above comment about the right API

andrei-stoian-zama and others added 12 commits June 10, 2024 11:11
fix: update to the new VL

chore: do not test p_error on linear models

chore: remove custom encrypted matmul

chore: add simulation compilation

chore: skip low bit-width rounding in tests

chore: update concrete-python nightly

chore: fix coverage

chore: default to n_jobs = 1 for the CI

chore: fix forbidden words

feat: approximate rounding by default

chore: remove cp install

chore: update licenses for macos-latest-xl

chore: experiment with 3-bit rounding

exp: debugging

Required modifying CP to set preprocessors before the other ones, but
it's compiling now.

wip, it breaks

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

feat: add rounding torch models tests

chore: add crypto-param tests

feat: more tests

fix: reinstate ceiling bitwidth computation

chore: refactor

chore: fixing tests, some weird behaviors

fix: add test and continue debugging

debugging

debugging

debugging

exp: debugging

exp: debugging

exp: debugging

feat: add more tlu optimization tests

exp: debugging

feat: rework preprocessor

fix: rework tlu optimization

feat: fix automatic rounding bugs

fix: remove all debugging in tests

fix: make conformance

fix: fix pylint and mypy

fix: remove debug files

fix: docstrings

fix: pcc

fix: cp installation

fix: revert deps to defaults for new cp

fix: revert deps to defaults for new cp

fix: pylint and docs

fix: tests

fix: tests 2
fix: bad syntax

fix: handle all shapes as input to subgraph
Tests are passing, with one flaky.

Not sure the work fully does what it's supposed to do.

Needs at least a re-check and probably some comments.
Still experimenting with it. In the current setting, using our approach
instead of the rounding-everything-to-6-bits approach degrades the top-1
accuracy of the CIFAR model too much (from 0.88 with the torch model, to
0.86 with 6-bit rounding, to 0.79 with auto-rounding).

I added a notebook to visualize what happens with the auto-rounding
optimization on a simple setting. Things look fishy to me.

Let's keep iterating on it until we find an appropriate setting that
works on the CIFAR-10 use-case and at least another use-case.
Still a work in progress.
I added a small script to help debug the pre-processor.
A few things remain:
- I often have a right or left offset of 1 on the step functions
- Single-jump TLUs are still buggy, because the value taken on the right
  is not the expected one
- There are some situations where Concrete does not agree with Numpy
- On CIFAR some TLUs have one more bit than expected.
All tests passed once.

The main improvement was a change to the closed form used for the bias.
There is probably still something to improve there.
I also added clipping after down-scaling in the LUT, and a check on delta
to see if something could be gained. My guess is that there is still
something to gain in the specific situation where
ceil(log2(x_max - x_min)) < bit_width([x_max, x_min]).
In that situation we could indeed just center everything and
scale to match the observed bit-width.
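The condition mentioned here, ceil(log2(x_max - x_min)) < bit_width([x_max, x_min]), could be checked with something like the sketch below. The `+ 1` on the spread and the exact bit_width convention are my own assumptions:

```python
import math


def recentering_bit_gain(x_min: int, x_max: int, current_bit_width: int) -> int:
    """Bits saved by re-centering: the spread x_max - x_min may need fewer
    bits than representing the raw interval [x_min, x_max] directly."""
    # Bits needed for (x_max - x_min + 1) distinct centered values
    spread_bits = max(1, math.ceil(math.log2(x_max - x_min + 1)))
    return max(0, current_bit_width - spread_bits)
```

For instance, values in [120, 135] stored in 8 bits span only 16 distinct levels, so re-centering would save 4 bits.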
Unsigned integer bounds are still not supported; there is something wrong
with how we handle them for now. Rather than it silently giving no results,
I found it best to just raise an error when they are encountered.

One bug was detected on CIFAR but another remains. To be followed up.
@fd0r fd0r force-pushed the feat/add_automatic_rounding branch from f74557a to 943d66a Compare June 10, 2024 09:11
3 participants