The update method in the UCB algorithm is inconsistent with the paper and code #180

Open
kerala21 opened this issue Mar 31, 2024 · 1 comment

Comments

@kerala21

In the paper's UCB algorithm, Q(p) for each prompt is updated to Q(p) + r/N(p).

Uploading 2024331203750.jpg…

The update code in the project is the following:

def update(self, chosen, scores):

    for i, score in zip(chosen, scores):
        self.counts[i] += self.num_samples
        self.scores[i] += score * self.num_samples

This doesn't match the update rule in the paper.
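To make the difference concrete, here is a minimal sketch that runs both rules side by side on one prompt. It rests on several assumptions: that N(p) is incremented by the sample count before Q(p) is updated, that the project turns its accumulated sums into an estimate by dividing `scores` by `counts` (the `mean` helper is hypothetical), and that `num_samples` is passed in as shown. The class names are made up for illustration.

```python
class PaperRule:
    """Q(p) <- Q(p) + r / N(p), as quoted from the paper above."""

    def __init__(self, num_prompts, num_samples):
        self.num_samples = num_samples
        self.counts = [0] * num_prompts
        self.q = [0.0] * num_prompts

    def update(self, chosen, scores):
        for i, score in zip(chosen, scores):
            # Assumption: N(p) grows by the sample count before Q(p) is updated.
            self.counts[i] += self.num_samples
            self.q[i] += score / self.counts[i]


class CodeRule:
    """Mirrors the update() from the project code quoted above."""

    def __init__(self, num_prompts, num_samples):
        self.num_samples = num_samples
        self.counts = [0] * num_prompts
        self.scores = [0.0] * num_prompts

    def update(self, chosen, scores):
        for i, score in zip(chosen, scores):
            self.counts[i] += self.num_samples
            self.scores[i] += score * self.num_samples

    def mean(self, i):
        # Hypothetical helper: assumes the estimate is sum / count.
        return self.scores[i] / self.counts[i]


paper = PaperRule(num_prompts=1, num_samples=2)
code = CodeRule(num_prompts=1, num_samples=2)
for r in (0.5, 1.0):  # two rounds of rewards for prompt 0
    paper.update([0], [r])
    code.update([0], [r])

print(paper.q[0], code.mean(0))
```

Under these assumptions the two estimates diverge after the second round, so the mismatch is not only a matter of notation. If the project derives its estimate differently, the comparison would need to be adjusted.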

@donglixp
Contributor

The jpg file is unavailable.

@donglixp donglixp reopened this May 10, 2024