[Bug]: Pipelines on Hop Server sometimes hang in state "Waiting", with little information in the log. #3861

Open
Oen1997 opened this issue Apr 25, 2024 · 3 comments

Comments

Oen1997 commented Apr 25, 2024

Apache Hop version?

2.8

Java version?

openjdk 21.0.3 2024-04-16 LTS

Operating system

Linux

What happened?

Status in Web UI:
image

Log lines in Web UI:
image

Server log:
image

Issue Priority

Priority: 2

Issue Component

Component: Hop Server

AlefRP commented May 6, 2024

Request for Additional Pipeline Details

Hello Oen1997,

Thanks for your input on the pipeline setup. To better address the issue, could you provide more detailed information about the pipeline configuration? Specifically:

  1. Transforms Used:
    • Could you list the specific transforms currently used in the pipeline, along with their configurations?
  2. CSV Input:
    • Is there a CSV input in the pipeline? If so, it would be helpful to check whether the 'lazy conversion' checkbox is marked. This option helps manage memory more efficiently by delaying data type conversions.
  3. Data Carrying Between Pipelines:
    • Please verify whether the pipeline is carrying tables from one stage to another excessively. This practice can lead to memory overflow and introduce bugs, especially when combined with database lookup transforms.

Your detailed feedback on these points will help us identify and fix the issue more effectively.

Best regards,
AlefRP

Oen1997 commented May 7, 2024

The pipeline is very simple, and the output is only ~2 million rows:
image

The Avro target folder is on GS.
By the way, in parallel I'm running similar pipelines (at different times) with other software on another server, uploading large CSVs to GS via the "gcloud storage cp" command, and that has never failed, so I would rule out network-related issues.

Oen1997 commented May 8, 2024

Another pipeline has been hanging for a few days, with zero info in the log:
image
