FineTuning #453
base: main
Conversation
Refactoring and the OpenAI Tuner are complete, along with some renaming of functions.
TODO: add metrics infrastructure for finetuning. Some useful repos: https://github.com/lxe/simple-llm-finetuner and https://github.com/kuleshov-group/llmtune/tree/main. The latter links to their lora.py file, which I believe we intended to at least investigate in the past (#208), so this might be a direct solution to that issue too. As the name of the repo suggests, it's meant for consumer-grade GPU finetuning; we can surely implement something for CPU too.
```python
def generate_file_path(recording_id: int) -> str:
    return f"{recording_id}_processed.jsonl"
```
Just thinking out loud: should this file be in /tmp instead of the current dir? Do we actually need this file permanently?
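One way to act on this suggestion is to build the path inside the system temp directory. A minimal sketch, reusing the `generate_file_path` name from the diff (the temp-dir placement is an assumption, not what the PR currently does):

```python
import os
import tempfile


def generate_file_path(recording_id: int) -> str:
    # Place the generated .jsonl in the system temp dir instead of the
    # current working directory, so it does not persist alongside project files.
    return os.path.join(tempfile.gettempdir(), f"{recording_id}_processed.jsonl")
```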
We could just run a shell command via subprocess to delete the generated file immediately after it has been uploaded to OpenAI for finetuning. WDYT?
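A sketch of that cleanup step. Note that plain `os.remove` does the same job without spawning a shell the way `subprocess.run(["rm", ...])` would; the helper name here is illustrative, not part of the PR:

```python
import os


def cleanup_upload_artifact(path: str) -> None:
    # Delete the generated .jsonl once it has been uploaded for finetuning.
    # os.remove avoids the overhead and quoting pitfalls of a subprocess call.
    if os.path.exists(path):
        os.remove(path)
```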
```python
import subprocess


class OpenAIFineTuner(FineTuner):
```
Nitpick: maybe it is a good idea to add a small README.md with a few example lines for documentation.
From a newcomer's point of view, it is hard to know which method should be called first and which one later.
Noted, will add this asap.
```python
condensed_data = self._condense_data(recording)
return condensed_data
```
Thanks for implementing this. This looks great ! 👍 💯
TODO: add tests, address comments.
Model used to finetune: Davinci.
expected output:
UPDATE: Need to try fine-tuning the newly available gpt-3.5-turbo-0613, because you can generate ChatCompletions with it.
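Fine-tuning gpt-3.5-turbo consumes chat-style `messages` rather than the prompt/completion pairs used by davinci, so existing examples would need a conversion pass. A hedged sketch (the helper is illustrative, not part of the PR):

```python
def to_chat_example(prompt: str, completion: str) -> dict:
    # Convert a legacy prompt/completion finetuning example into the
    # chat-messages format expected by gpt-3.5-turbo fine-tuning.
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": completion.strip()},
        ]
    }
```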
```json
[
    {
        "prompt": "({'name': 'move', 'mouse_x': 354.65234375, 'mouse_y': 130.06640625, 'element_state': {}}, {'title': 'Terminal openadapt \u2014 poetry shell \u25b8 Python \u2014 124\u00d740', 'left': 283, 'top': 109, 'width': 878, 'height': 595, 'window_id': 1129})",
        "completion": " ({'name': 'move', 'mouse_x': 356.796875, 'mouse_y': 124.1640625, 'element_state': {}}, {'title': 'Terminal openadapt \u2014 poetry shell \u25b8 Python \u2014 124\u00d740', 'left': 283, 'top': 109, 'width': 878, 'height': 595, 'window_id': 1129})"
    },
]
```
Can you please move this to tests/assets/fixtures.json or similar?
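The fixture entries above pair a `(action, window_state)` tuple as the prompt with the next tuple as the completion. A sketch of how such an example might be produced (this helper is illustrative, not the PR's actual serialization code):

```python
def make_finetune_example(current: tuple, nxt: tuple) -> dict:
    # Use the current (action, window_state) pair as the prompt and the next
    # pair as the completion; the leading space on the completion follows
    # OpenAI's legacy completion-finetuning convention.
    return {"prompt": str(current), "completion": " " + str(nxt)}
```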
…cement-Org/OpenAdapt into context_window_reduc
TODO: fix flake8 errors before merging.
What kind of change does this PR introduce?
addresses #69
Summary
Checklist
How can your code be run and tested?
Other information