[tsv-] let options set NUL delimiters #2275

Merged Feb 4, 2024 (3 commits)

midichef
Contributor

@midichef midichef commented Jan 29, 2024

Closes #2272.

I also added --row-delimiter= (with an empty value) to use '\x00' as the row separator. That'll be useful for reading the output of find -print0 and for producing input for xargs -0.
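
For readers unfamiliar with NUL-separated records, here is a minimal sketch of what splitting on '\x00' looks like. It is not the code in this PR: split_records and the sample bytes are hypothetical names made up for illustration, and the UTF-8 decoding is an assumption.

```python
def split_records(data: bytes, row_delimiter: bytes = b"\x00") -> list[str]:
    """Split raw bytes into rows on row_delimiter (hypothetical helper)."""
    parts = data.split(row_delimiter)
    # find -print0 terminates every path with NUL, so the final piece
    # after splitting is an empty string; drop it.
    if parts and parts[-1] == b"":
        parts.pop()
    # Assumption: paths are UTF-8; surrogateescape keeps undecodable bytes.
    return [p.decode("utf-8", errors="surrogateescape") for p in parts]


# A filename containing a newline stays in a single row,
# which is the reason to use NUL instead of '\n'.
sample = b"visidata/main.py\x00visidata/odd\nname.txt\x00"
rows = split_records(sample)
```

The point of the design is that NUL cannot appear in a file path, so it is the only separator that is safe for arbitrary filenames.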

@anjakefala
Collaborator

Hey @midichef!

I'm trying to confirm this works with find visidata -print0 | vd --row-delimiter='', but it does not seem to be separating the output into rows:

Screenshot from 2024-02-01 21-42-15

Is there something I am testing incorrectly?

@midichef
Contributor Author

midichef commented Feb 3, 2024

Right now the option only applies when the filetype is tsv, so the command would need to be: find visidata -print0 | vd -f tsv --row-delimiter=''

I added some warnings for when the field or row delimiter is NUL, because it may surprise people. One situation I'm thinking of: someone has run vd file.tsv and is exploring the options sheet after hitting O. They hit e to edit the blank cell for the global delimiter option. The · character appears in that cell (disp_unprintable, for '\t'). They erase it and hit Enter. The blank cell looks the same as before they started editing it, so they don't think much of it. But they have unwittingly overwritten the default value of vd.options.delimiter with the empty string '', so the actual delimiter will be '\0'. Then they save their TSV file, and it's full of '\0'. And life is less great. A warning can help them realize what happened.
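
The empty-string-means-NUL rule and the warning can be sketched roughly as below. This is only an illustration of the behaviour described above, not the code in this PR: effective_delimiter is a hypothetical name, and using Python's warnings module is an assumption made so the sketch is self-contained.

```python
import warnings


def effective_delimiter(option_value: str, name: str = "delimiter") -> str:
    """Return the delimiter to use for a given option value (hypothetical helper).

    An empty option value is treated as NUL, and a warning is emitted so
    that an accidental blank does not go unnoticed.
    """
    if option_value == "":
        warnings.warn(f"{name} is empty; NUL ('\\x00') will be used as the delimiter")
        return "\x00"
    return option_value
```

With something like this in place, the user who blanked the option cell would see a message at load or save time instead of silently getting a file full of '\0'.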

@midichef
Contributor Author

midichef commented Feb 3, 2024

The issue is more complicated than I realized though. How should we handle comments when the delimiter is NUL?

For example, TsvSheet.options.regex_skip = '^#.*' currently skips lines that look like comments. It should definitely not do that when handling the output of find -print0.

My intuition is, regex_skip should not be used when the row delimiter is NUL, as we're not in classic TSV format any more.
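
A rough sketch of that intuition, for discussion only. It is not how the TSV loader is written: iter_rows and its parameters are hypothetical, and dropping one trailing delimiter is an assumption made to keep the example small.

```python
import re


def iter_rows(text: str, row_delimiter: str = "\n", regex_skip: str = "^#.*"):
    """Yield rows from text, applying regex_skip only for non-NUL row delimiters."""
    # With a NUL row delimiter we are no longer reading classic TSV,
    # so comment skipping is switched off entirely.
    skip = None
    if regex_skip and row_delimiter != "\x00":
        skip = re.compile(regex_skip)

    # Drop one trailing delimiter so it does not produce an empty last row.
    if text.endswith(row_delimiter):
        text = text[: -len(row_delimiter)]
    if not text:
        return  # empty input yields no rows

    for row in text.split(row_delimiter):
        if skip and skip.match(row):
            continue
        yield row


tsv_rows = list(iter_rows("#comment\na\tb\n"))
nul_rows = list(iter_rows("#notes.txt\x00a.txt\x00", row_delimiter="\x00"))
```

In the second call a path that happens to start with # is kept, which is the behaviour wanted for find -print0 output.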

@anjakefala
Collaborator

> My intuition is, regex_skip should not be used when the row delimiter is NUL, as we're not in classic TSV format any more.

I would say, let's add that to an issue as a follow-up task, but get this piece merged in for now. No need to block the PR.

@anjakefala anjakefala merged commit ebf5696 into saulpw:develop Feb 4, 2024
13 checks passed
@midichef midichef deleted the tsv_nul branch February 5, 2024 02:43
Development

Successfully merging this pull request may close these issues.

[tsv] add CLI option to use NUL as delimiter
3 participants