Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(not-defining): Hog #22336

Merged
merged 81 commits into from
May 28, 2024
Merged

feat(not-defining): Hog #22336

merged 81 commits into from
May 28, 2024

Conversation

mariusandra
Copy link
Collaborator

@mariusandra mariusandra commented May 16, 2024

Changes

Adds a "hog" feature flag which enables:

  • a the "Hog" tab for insights
image
  • a "Hog" button under /debug:

image

The feature

Does this work well for both Cloud and self-hosted?

Yes

How did you test this code?

See the PR

Out of scope / next steps

  • The C++ parser for Hog. Currently this is using the Python one since it's easier to build. We'll port it over when speed of parsing becomes a problem.
  • All of the JS async flow.

@posthog-bot
Copy link
Contributor

It looks like the code of hogql-parser has changed since last push, but its version stayed the same at 1.0.8. 👀
Make sure to resolve this in hogql_parser/setup.py before merging!

@mariusandra
Copy link
Collaborator Author

@benjackwhite ready for a look 👀

@mariusandra mariusandra requested a review from a team May 28, 2024 10:26
Copy link
Contributor

@benjackwhite benjackwhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

99% trusting the actual hog implementation itself. Some comments about the current implemented functions.
I'm definitely on the "lets go" camp now but I still think the danger things that we already know are a bit riskier make sense to perhaps leave out now and add "properly" (i.e. sending http calls via our webhook delivery service or something, console logs stored in some state instead of being actually written to the console)

raise HogVMException(f"Invalid bytecode. Must start with '{HOGQL_BYTECODE_IDENTIFIER}'")

while (symbol := next(iterator, None)) is not None:
def check_timeout():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

expect(await executeHogQLBytecode(['_h', op.INTEGER, 0, op.INTEGER, 1, op.AND, 2], fields)).toBe(false)
expect(await executeHogQLBytecode(['_h', op.INTEGER, 1, op.INTEGER, 0, op.INTEGER, 1, op.OR, 3], fields)).toBe(
true
expect(await exec(['_h'], fields)).toBe(null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long term - we should make this run both python and nodejs in parallel against the same test cases if possible

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. Especially as we're talking about adding a third target, Rust, as well.

return args[0][::-1]


def httpGet(name: str, args: list[Any], team: Optional[Team], stdout: Optional[list[str]], timeout: int):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh this is the only bit im not super in favour of having already as its the main place where we will have "dangerzone" thoughts in terms of long term running operations, DNS spoofing etc. If there's no rush, could we not add this later properly instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(╯°□°)╯︵ ┻━┻

It's gone

Comment on lines 42 to 66
httpGet: (args, _, timeout) => {
async function httpGet(url: string, timeout: number = 5): Promise<string | null> {
const timeoutPromise = new Promise<null>((_, reject) => {
setTimeout(() => {
reject(new Error(`Request timed out after ${timeout} seconds`))
}, timeout * 1000)
})

try {
const response = await Promise.race([fetch(url), timeoutPromise])
if (!response || !response.ok) {
throw new Error('Network response was not ok.')
}
return await response.text()
} catch (error) {
throw new Error(`Failed to fetch: ${error.message}`)
}
}

return httpGet(args[0], timeout)
},
print: (args) => {
// eslint-disable-next-line no-console
console.log(...args)
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these are the potential "danger" functions - one does arbitrary http calls and the other write to console and therefor IO. Likely we would want an execution wrapper that actually takes these calls and performs them carefully. Happy to approve and move forward but feels like something that could for sure be pushed back?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. httpGet is gone. I kept the print for now, but will rework it in the next PR to append to a log array just like the Python version does. I don't see a big danger as the JS VM is not used anywhere now.

@mariusandra mariusandra merged commit 45d275c into master May 28, 2024
88 checks passed
@mariusandra mariusandra deleted the hog branch May 28, 2024 12:51
Comment on lines 27 to +31
class TestParserPython(parser_test_factory("python")):
pass
def _program(self, program: str, placeholders: dict[str, ast.Expr] | None = None) -> ast.Program:
return parse_program(program, placeholders=placeholders, start=None)

def test_program_variable_declarations(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not make TestParserPython diverge from TestPythonCpp. They should inherit all the same stuff from parser_test_factory(), so that the implementations stay in sync. Ofc we can move faster with the Python one, but once in master, we should converge ASAP @mariusandra

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. Adding C++ parsing support for Hog and fixing this is somewhere high on my list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants