Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#8563: sweep split_query_key_value_and_split_heads, split and concat #8610

Merged
merged 3 commits into from
Jun 5, 2024

Conversation

sjameelTT
Copy link
Contributor

@sjameelTT sjameelTT commented May 17, 2024

Add new sweep tests for ttnn.transformers.split_query_key_value_and_split_heads, split and concat

split_query_key_value_and_split_heads:

  • Batch_size, and cores_h are together in a tuple to minimize permutations that are expected to fail
  • (Num_q_heads, num_kv_heads, cores_w) and (seq_len_q, seq_len_kv) are also in a tuple for the same reason
  • If all required combinations were tested with all data types and memory configurations then we would have 24192 as opposed to the current 2918
  • PCC will be low for the interleaved version since the sharded and interleaved versions both expect the logical QKV tensor to be concatenated along different dimensions
  • TODO: add expected failure cases too for configurations that shouldn't be supported

split:

  • use ttnn.experimental.tensor.split_dim_two_chunks_tiled instead of ttnn test for now since split is not implemented and just a wrapper
  • add some known working configs since the current split is hardcoded to split in half
  • TODO: ttnn.split should be implemented, potentially call the ttnn.experimental versino
  • TODO: remove known working config cases when ttnn.split is implemented

concat:

  • concat tests reworked to be able to track the dimensions for each test case (makes for easier debugging)

@sjameelTT sjameelTT force-pushed the sjameel/sweeps branch 7 times, most recently from ef8ed30 to 11570d3 Compare May 24, 2024 18:45
@sjameelTT sjameelTT changed the title #8563: sweep split_query_key_value_and_split_heads #8563: sweep split_query_key_value_and_split_heads, split and concat May 27, 2024
@sjameelTT
Copy link
Contributor Author

- Batch_size, and cores_h are together in a tuple to minimize permutations that are expected to fail
- (Num_q_heads, num_kv_heads, cores_w) and (seq_len_q, seq_len_kv) are also in a tuple for the same reason
- If all required combinations were tested with all data types and memory configurations then we would have 24192 as opposed to the current 2918
- PCC will be low for the interleaved version since the sharded and interleaved versions both expect the logical QKV tensor to be concatenated along different dimensions
- TODO: add expected failure cases too for configurations that shouldn't be supported
- concat tests reworked to be able to track the dimensions for each test case (makes for easier debugging)
- use ttnn.experimental.tensor.split_dim_two_chunks_tiled instead of the ttnn split for now since split is not implemented and just a wrapper
- add some known working configs since the current split is hardcoded to split in half
- TODO: ttnn.split should be implemented, potentially call the ttnn.experimental versino
- TODO: remove known working config cases when ttnn.split is implemented
@sjameelTT sjameelTT merged commit 6220520 into main Jun 5, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants