
Feat/add automatic rounding #558

Draft
wants to merge 12 commits into base: main

Conversation

@andrei-stoian-zama (Collaborator) commented Mar 25, 2024

Adds an automatic rounding mechanism.

  1. Goes through the graph and finds TLU subgraph nodes.
  2. Analyzes the outputs of those TLUs on the calibrated input range.
  3. Identifies the intervals (step sizes) where the TLU is constant and the offsets where the intervals start.
  4. Raises the precision of TLU inputs to map the step sizes to a power of two.
  5. Modifies the graph and the TLU to include rounding and cancel out the raised precision in the TLU.
  6. Applies the previous steps to all TLUs, for each unique dimension of the TLU output (like apply_mapped_lookup_table).
  7. Finds the best rounding configuration by taking the one with the lowest MSE reconstruction with respect to the original TLU.
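Steps 2–4 above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation in preprocessors.py; the function names and the gcd-based step detection are my own assumptions:

```python
from typing import Tuple

import numpy as np


def find_tlu_steps(tlu_outputs: np.ndarray) -> Tuple[int, int]:
    """Find the interval width over which the TLU output is constant,
    and the offset at which the first interval ends.

    `tlu_outputs` is the TLU evaluated over the calibrated input range.
    Returns (step_size, offset); step_size is 0 for a constant TLU.
    """
    # Indices where the TLU output changes value
    jumps = np.nonzero(np.diff(tlu_outputs))[0] + 1
    if jumps.size == 0:
        return 0, 0  # constant TLU: it could be eliminated entirely
    offset = int(jumps[0])
    if jumps.size == 1:
        # Single jump: treat the remaining range as one step (heuristic)
        return int(tlu_outputs.size - offset), offset
    # A common divisor of the jump spacings gives the constant-interval width
    return int(np.gcd.reduce(np.diff(jumps))), offset


def rounding_bits(step_size: int) -> int:
    """LSBs that can be rounded away when the TLU is constant over
    intervals of `step_size` consecutive inputs (power-of-two part only)."""
    return int(np.floor(np.log2(step_size))) if step_size > 1 else 0
```

For example, a staircase TLU that is constant over runs of 4 inputs would allow 2 low bits to be rounded away before the table lookup.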

@cla-bot cla-bot bot added the cla-signed label Mar 25, 2024
@andrei-stoian-zama andrei-stoian-zama force-pushed the feat/add_automatic_rounding branch 2 times, most recently from deba506 to 2700f25 Compare March 27, 2024 20:11
@andrei-stoian-zama andrei-stoian-zama changed the base branch from main to chore/update_cp_for_approx_rounding March 27, 2024 20:11
Makefile Outdated
@@ -54,7 +54,6 @@ setup_env:
echo "Finished installing poetry lock."

echo "Installing $(CONCRETE_PYTHON_VERSION)" && \
poetry run python -m pip install -U --pre "$(CONCRETE_PYTHON_VERSION)"
Collaborator Author:

I have no idea why this is here.

conftest.py Outdated
f"# {randomly_seed} # {str(request.node.own_markers)}"
)

print(f"{derivation_string=}")
Collaborator Author:

to be removed

@andrei-stoian-zama andrei-stoian-zama marked this pull request as ready for review March 27, 2024 20:20
@andrei-stoian-zama andrei-stoian-zama requested a review from a team as a code owner March 27, 2024 20:20
@jfrery jfrery force-pushed the chore/update_cp_for_approx_rounding branch from abafd1d to dd3f088 Compare March 28, 2024 13:53
Base automatically changed from chore/update_cp_for_approx_rounding to main March 28, 2024 15:44

Coverage failed ❌

Coverage details

---------- coverage: platform linux, python 3.8.18-final-0 -----------
Name                                      Stmts   Miss  Cover   Missing
-----------------------------------------------------------------------
src/concrete/ml/common/preprocessors.py     329     22    93%   101, 108, 135, 147-148, 206, 308, 378, 455, 674-681, 779, 860-863, 941-942
src/concrete/ml/pytest/torch_models.py      645      5    99%   1628-1629, 1635-1636, 1647
-----------------------------------------------------------------------
TOTAL                                      7511     27    99%

51 files skipped due to complete coverage.

Makefile Outdated
@@ -9,6 +9,7 @@ SRC_DIR:=src
TEST?=tests
N_CPU?=4
CONCRETE_PACKAGE_PATH=$(SRC_DIR)/concrete
SPHINX_APIDOC_EXCLUDE=$(SRC_DIR)/concrete/ml/common/preprocessors.py
Collaborator:

What is this exactly?

Collaborator Author:

For some reason, Sphinx was crashing on this file.

Collaborator Author:

this can now be removed since we don't have sphinx anymore


for node in sorted_nodes:
if node.operation == Operation.Input:
# Deepcopy on the fhe.Graph doesn't modify the input node/indices mappings
Collaborator:

Not sure what this comment is about.

}
attributes = {}
if rounding_function.__name__ == "round_bit_pattern":
# These kwargs are not supported atm
Collaborator:

Again, what is this comment? Should it have a FIXME?

rounding_node.properties["overflow_protection"] = overflow_protection
rounding_node.properties["exactness"] = exactness

rounding_node.bounds = a_node.bounds # Might be over/under-estimated
Collaborator:

Again, should this comment be removed?

@@ -0,0 +1,999 @@
"""Graph pre-processors for automatic rounding."""
Collaborator:

Overall looks good, but:

  • very few comments in the actual code, which makes the whole file hard to follow (I cannot imagine debugging this in the future)
  • some remaining comments are confusing / seem irrelevant



class TorchAutoRoundingTLUTester(nn.Module):
"""A small quantized network with Brevitas, trained on make_classification."""
Collaborator:

Bad copy-paste?

@@ -236,116 +236,7 @@ def q_impl(

p = input2_q_values.shape[-2]

# Remove the manual matrix multiplication when we can handle input precision with rounding
# FIXME: https://github.com/zama-ai/concrete-internal/issues/512
def enc_mul(x, y):
Collaborator:

Is this expected? Do we no longer need it?

reference = f(input_set)

if function_name in ["staircase_pot", "staircase"] and execution_number == 0:
# TODO: round to 1b or eliminate these TLUs entirely
Collaborator:

Do we need a FIXME here?

[circuit_no_optim_no_rounding.simulate(numpy.array([elt])) for elt in input_set]
)[..., 0]

graph_res = numpy.array([circuit.graph(numpy.array([elt])) for elt in input_set])[..., 0]
Collaborator:

Do we still need to test the graph (old VL)?

@@ -0,0 +1,198 @@
"""Unit tests for TLU optimization preprocessors."""
Collaborator:

A bit like above: looks great overall, but the very few (random) comments make the file hard to understand (several times I have no clue what is actually going on).


count_reinterpret = 0
count_tlu = 0
for line in model.quantized_module_.fhe_circuit.mlir.split("\n"):
Collaborator:

Should this be a function, either in source (pytest) or in this file? It looks important.
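A hedged sketch of what such a helper could look like. The op-name substrings below are assumptions for illustration, not necessarily the exact tokens emitted in the MLIR dump:

```python
def count_mlir_ops(mlir: str, op_names=("apply_lookup_table", "reinterpret_precision")) -> dict:
    """Count occurrences of selected op names in a textual MLIR dump,
    scanning line by line (as with fhe_circuit.mlir)."""
    counts = {name: 0 for name in op_names}
    for line in mlir.split("\n"):
        for name in op_names:
            if name in line:
                counts[name] += 1
    return counts
```

The test could then assert on `counts["reinterpret_precision"]` and `counts["apply_lookup_table"]` instead of maintaining the loop inline.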

"max_epochs": 10,
}

model = NeuralNetClassifier(**params)
Collaborator:

Small detail, but you could update instantiate_model_generic to be able to provide custom params and then use the preamble; that makes the test easier to control.

CURRENT_DIR = Path(__file__).resolve().parent
KEYGEN_CACHE_DIR = CURRENT_DIR.joinpath(".keycache")

# Add MPS (for macOS with Apple Silicon or AMD GPUs) support when error is fixed. For now, we
# observe a decrease in torch's top1 accuracy when using MPS devices
# FIXME: https://github.com/zama-ai/concrete-ml-internal/issues/3953
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
NUM_SAMPLES = int(os.environ.get("NUM_SAMPLES", 1))

NUM_SAMPLES = int(os.environ.get("NUM_SAMPLES", 1000 if SIMULATE_ONLY else 1))
Collaborator:

1000 samples, even with simulation, looks a bit too much, no? Why not keep 100?

from concrete.ml.quantization import QuantizedModule
from concrete.ml.torch.compile import compile_brevitas_qat_model

SIMULATE_ONLY = True
Collaborator:

Should we make this an env var?
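A minimal sketch of the suggested env-var approach (the variable name and default are my assumptions):

```python
import os


def simulate_only(default: bool = True) -> bool:
    """Read the flag from the environment, e.g. SIMULATE_ONLY=0 to disable."""
    return os.environ.get("SIMULATE_ONLY", "1" if default else "0") == "1"


# Module-level constant, resolved once at import time
SIMULATE_ONLY = simulate_only()
```

This keeps the current default behavior while letting CI jobs flip the flag without editing the file.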

@@ -76,11 +82,20 @@ def wrapper(*args, **kwargs):
# cache generated keys through `insecure_key_cache_location`. As the name suggests, these
# parameters are unsafe and should only be used for debugging in development
# Multi-parameter strategy is used in order to speed-up the FHE executions
base_configuration = Configuration()
Collaborator:

To be removed?

import sys

if sys.platform == "darwin":
print("skipping fhe evaluation on darwin platform")
Collaborator:

Why is that?

@RomanBredehoft (Collaborator) left a comment:

Thanks for this! Overall it looks great, but we might want to clean up some things at one point: important files/code are missing some comments to better explain what is going on (some parts are pretty obscure), several now-irrelevant comments should be removed, and some things could be simplified in tests.

assert isinstance(evaluator, GenericEvaluator)

if "subgraph" not in evaluator.properties["kwargs"]:
# Not supported for now
Collaborator Author:

When does this case arise? Is it for fhe.univariate use?

Collaborator:

Yes, univariate or explicit table use in Concrete.

In any case it's a good check, since in the following we try to insert nodes into the subgraph.

@fd0r fd0r marked this pull request as draft April 24, 2024 19:21
Makefile Outdated
@@ -9,6 +9,7 @@ SRC_DIR:=src
TEST?=tests
N_CPU?=4
CONCRETE_PACKAGE_PATH=$(SRC_DIR)/concrete
SPHINX_APIDOC_EXCLUDE=$(SRC_DIR)/concrete/ml/common/preprocessors.py
Collaborator Author:

this can now be removed since we don't have sphinx anymore

Makefile Outdated
@@ -415,7 +416,7 @@ docs_no_links: clean_docs check_docs_dollars
mkdir -p docs/_static/
@# Generate the auto summary of documentations
@# Cannot do without specifying top module currently with sphinx-apidoc
poetry run sphinx-apidoc --implicit-namespaces -o docs/_apidoc $(CONCRETE_PACKAGE_PATH)
poetry run sphinx-apidoc --implicit-namespaces -o docs/_apidoc $(CONCRETE_PACKAGE_PATH) $(SPHINX_APIDOC_EXCLUDE)
Collaborator Author:

same here

configuration = Configuration(
dump_artifacts_on_unexpected_failures=False,
enable_unsafe_features=True,
use_insecure_key_cache=True,
insecure_key_cache_location=KEYGEN_CACHE_DIR,
additional_pre_processors=[
Collaborator Author:

why do we need this? we should implement compile_torch_model with rounding_threshold_bits: { n_bits=AUTO, method = approximate }
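To illustrate the API shape proposed in this comment, the option might be normalized somewhere inside compile_torch_model. Everything below is hypothetical — none of these names exist in the codebase; it only sketches how an int, a dict, or a dedicated object could map onto one configuration:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class RoundingThresholdBits:
    """Hypothetical container for the proposed rounding option."""
    n_bits: Union[int, str] = "auto"  # an int, or "auto" for automatic selection
    method: str = "approximate"       # "approximate" or "exact"


def normalize_rounding(option) -> RoundingThresholdBits:
    """Accept an int, a dict, or a RoundingThresholdBits and normalize it."""
    if isinstance(option, int):
        # A bare int keeps the current behavior: exact rounding to n_bits
        return RoundingThresholdBits(n_bits=option, method="exact")
    if isinstance(option, dict):
        return RoundingThresholdBits(**option)
    return option
```

With something like this, `rounding_threshold_bits=6` stays backward compatible while `{"n_bits": "auto", "method": "approximate"}` would select the automatic pre-processor internally.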

cfg = Configuration(
verbose=True,
show_optimizer=args.show_optimizer,
additional_pre_processors=[
Collaborator Author:

see above comment about the right API

andrei-stoian-zama and others added 12 commits June 10, 2024 11:11
fix: update to the new VL

chore: do not test p_error on linear models

chore: remove custom encrypted matmul

chore: add simulation compilation

chore: skip low bit-width rounding in tests

chore: update concrete-python nightly

chore: fix coverage

chore: default to n_jobs = 1 for the CI

chore: fix forbidden words

feat: approximate rounding by default

chore: remove cp install

chore: update licenses for macos-latest-xl

chore: experiment with 3-bit rounding

exp: debugging

Required modifying CP to set preprocessors before the other ones, but
it's compiling now.

wip, it breaks

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

exp: debugging

feat: add rounding torch models tests

chore: add crypto-param tests

feat: more tests

fix: reinstate ceiling bitwidth computation

chore: refactor

chore: fixing tests, some weird behaviors

fix: add test and continue debugging

debugging

debugging

debugging

exp: debugging

exp: debugging

exp: debugging

feat: add more tlu optimization tests

exp: debugging

feat: rework preprocessor

fix: rework tlu optimization

feat: fix automatic rounding bugs

fix: remove all debugging in tests

fix: make conformance

fix: fix pylint and mypy

fix: remove debug files

fix: docstrings

fix: pcc

fix: cp installation

fix: revert deps to defaults for new cp

fix: revert deps to defaults for new cp

fix: pylint and docs

fix: tests

fix: tests 2
fix: bad syntax

fix: handle all shapes as input to subgraph
Tests are passing, with one flaky.

Not sure the work fully does what it's supposed to do.

Needs at least a re-check and probably some comments.
Still experimenting with it. In the current setting, using our approach
instead of the rounding-everything-to-6-bits approach degrades the top-1
accuracy of the CIFAR model too much (from 0.88 with the torch model, to
0.86 with 6-bit rounding, to 0.79 with auto-rounding).

I added a notebook to visualize what happens with the auto-rounding
optimization on a simple setting. Things look fishy to me.

Let's keep iterating on it until we find an appropriate setting that
works on the CIFAR-10 use-case and at least another use-case.
Still a work in progress.
I added a small script to help debug the pre-processor.
A few things remain:
- I often have a right or left offset of 1 on the step functions
- Single-jump TLUs are still buggy, because the value taken on the right
  is not the expected one
- There are some situations where Concrete does not agree with Numpy
- On CIFAR some TLUs have one more bit than expected.
All tests passed once.

The main improvement was a change to the closed form used for the bias.
There is probably still something to improve there.
I also added clipping after down-scaling in the LUT, and a check on delta
to see if something could be gained. My guess is that there is still
something to gain in the specific situation where
ceil(log2(x_max - x_min)) < bit_width([x_max, x_min]).
In that situation we could indeed just center everything and
scale to match the observed bit-width.
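The condition mentioned here, ceil(log2(x_max - x_min)) < bit_width([x_max, x_min]), could be checked with something like the sketch below. The `+ 1` on the spread and the exact bit_width convention are my own assumptions:

```python
import math


def recentering_bit_gain(x_min: int, x_max: int, current_bit_width: int) -> int:
    """Bits saved by re-centering: the spread x_max - x_min may need fewer
    bits than representing the raw interval [x_min, x_max] directly."""
    # Bits needed for (x_max - x_min + 1) distinct centered values
    spread_bits = max(1, math.ceil(math.log2(x_max - x_min + 1)))
    return max(0, current_bit_width - spread_bits)
```

For instance, values in [120, 135] stored in 8 bits span only 16 distinct levels, so re-centering would save 4 bits.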
Unsigned integer bounds are still not supported; there is something wrong
with how we handle them for now. Rather than it silently giving no results,
I found it best to just raise an error when they are encountered.

One bug was detected on CIFAR but another remains. To be followed up.
@fd0r fd0r force-pushed the feat/add_automatic_rounding branch from f74557a to 943d66a Compare June 10, 2024 09:11
3 participants