Ability to read comments/metadata from sound files #1656

Open

@Wizzerinus Wizzerinus commented May 14, 2024

Issue description

This PR allows loading metadata/comments/tags from sound files and exposes those to the programmer. Example code:

from direct.showbase.ShowBase import ShowBase
from panda3d.core import loadPrcFileData

loadPrcFileData("", "audio-library-name p3openal_audio")
b = ShowBase()
f = b.loader.loadMusic("file.ogg")
print(list(f.raw_comments))  # something like ['ARTIST=artist', 'TITLE=my music track\n']
print(f.comments['TITLE'])  # something like 'my music track'

Solution description

Metadata is exposed through two virtual methods, one on the AudioCursor, and another one on the AudioSound, linked through some binding code in AudioManager.

We have three sound managers: FMod, Null and OpenAL. FMod is too encapsulated to expose anything, and Null is Null, so those two managers return an empty string when the comment is read. For OpenAL, the method depends on the underlying AudioCursor class.
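The fallback behaviour described above can be pictured with a small Python model. This is only a sketch: every class name below is hypothetical, the real change lives in C++, and the sketch returns an empty list where the text above mentions an empty string.

```python
class CursorSketch:
    """Hypothetical stand-in for a cursor base class (not Panda3D's API)."""

    def raw_comments(self):
        # Default: a format without metadata support reports no comments.
        return []


class VorbisCursorSketch(CursorSketch):
    """Hypothetical cursor for a format that does carry comments."""

    def __init__(self, comments):
        self._comments = list(comments)

    def raw_comments(self):
        # Return a copy so callers cannot mutate the cursor's state.
        return list(self._comments)


class SoundSketch:
    """Hypothetical sound object that forwards the request to its cursor."""

    def __init__(self, cursor=None):
        self._cursor = cursor

    def raw_comments(self):
        # A manager with no accessible cursor (FMod, Null) has nothing to report.
        if self._cursor is None:
            return []
        return self._cursor.raw_comments()


null_sound = SoundSketch()
vorbis_sound = SoundSketch(VorbisCursorSketch(["ARTIST=artist"]))
print(null_sound.raw_comments())
print(vorbis_sound.raw_comments())
```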

There are seven types of AudioCursor: FFmpeg, Opus, Flac, Wav, Vorbis, Microphone and Userdata. Only three of these currently have an implementation:

  • The implementation for FFmpeg is based on https://ffmpeg.org/doxygen/0.7/metadata-example_8c-source.html. Note that FFmpeg uses a different data structure from ogg, so its metadata had to be converted to vector<string>.
  • The implementations for Opus and Vorbis are more or less the same, with minor differences based on how the Tag struct is accessed.
  • There is no implementation for Flac, because the dr_flac version bundled with Panda3D is extremely old and does not have the metadata iterator. This can be revisited in the future when/if the bundled dr_flac is updated.
  • Wav does not have an implementation, since Panda parses those files at the byte level and the standard does not seem to have a metadata field. Calling getComment() therefore returns an empty string here as well.
  • Microphone and Userdata don't have an implementation, for obvious reasons.
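To illustrate how raw "KEY=value" comment strings like those in the example above could be turned into a tag dictionary, here is a minimal Python sketch. The function name parse_comments is hypothetical and this is not the code from the PR. It assumes Vorbis-style comments, upper-cases keys because Vorbis comment field names are case-insensitive, and keeps only the first value for a repeated key, which is a simplification.

```python
def parse_comments(raw_comments):
    """Turn ['KEY=value', ...] into {'KEY': 'value', ...} (hypothetical helper)."""
    tags = {}
    for entry in raw_comments:
        key, sep, value = entry.partition("=")
        if not sep:
            # Entry has no '=' separator, so it is not a KEY=value comment.
            continue
        # Upper-case the key and drop surrounding whitespace from the value;
        # setdefault keeps the first value when a key is repeated.
        tags.setdefault(key.upper(), value.strip())
    return tags


print(parse_comments(["ARTIST=artist", "TITLE=my music track\n"]))
# {'ARTIST': 'artist', 'TITLE': 'my music track'}
```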

Due to an interrogate limitation, the published C++ functions return a string which is obtained by joining the vector of comments with newlines. Not sure if there's a better way. (Edit: this is no longer the case, see below.)

I've tested the implementation with libvorbis, where it seems to work correctly. I did not test FFmpeg and Opus (besides ensuring that the engine compiles and can be imported without ABI errors), since there is no documentation on how to force Panda to use a specific cursor loader.

Checklist

I have done my best to ensure that…

  • …I have familiarized myself with the CONTRIBUTING.md file
  • …this change follows the coding style and design patterns of the codebase
  • …I own the intellectual property rights to this code
  • …the intent of this change is clearly explained
  • …existing uses of the Panda3D API are not broken
  • …the changed code is adequately covered by the test suite, where possible.

@rdb

rdb commented May 14, 2024

Great work, thank you! Some comments.

We're trying to phase out Python extensions added in direct. They're only available when direct is loaded. Prefer a C++ interface if at all possible, or a C++-implemented extension method (see _ext.h and _ext.cxx files) if nothing else is possible.

Doing it in C++ would mean making sequential access methods (which means you could parse it into a SimpleHashMap, that supports both key-based and indexed access) and exposing those to Python with MAKE_MAP_PROPERTY - do a grep for that to search for examples. We can keep the C++ methods making up the map-property as unpublished, so that the property is the only way to access it from Python.
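As a rough picture of a container that supports both key-based and indexed access, here is a Python sketch. The class and method names are hypothetical; they do not reproduce SimpleHashMap or the arguments MAKE_MAP_PROPERTY actually expects, and only model the idea of pairing a "has key / get by key" interface with "count / key at index" accessors.

```python
class TagMapSketch:
    """Hypothetical tag store with key-based and indexed access."""

    def __init__(self):
        self._keys = []      # insertion order, used for indexed access
        self._values = {}    # key -> value, used for key-based access

    def store(self, key, value):
        # Record the key's position only the first time it is seen.
        if key not in self._values:
            self._keys.append(key)
        self._values[key] = value

    # Key-based access
    def has_tag(self, key):
        return key in self._values

    def get_tag(self, key):
        return self._values[key]

    # Sequential (indexed) access
    def get_num_tags(self):
        return len(self._keys)

    def get_tag_key(self, index):
        return self._keys[index]


tags = TagMapSketch()
tags.store("ARTIST", "artist")
tags.store("TITLE", "my music track")
for i in range(tags.get_num_tags()):
    key = tags.get_tag_key(i)
    print(key, "=", tags.get_tag(key))
```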

An alternative is making an AudioTags class which exposes a mapping interface (overriding operator []), and so behaves like a dictionary. Then we just need a getter property on the AudioSound. This would be the preferred method if the interface is bigger (like an AudioMetadata containing more than just tags), so that the AudioSound interface isn't cluttered up. If it's just a dict of tags, I think MAKE_MAP_PROPERTY might be easier.

I don't think that we should have this interface on both AudioSound and MovieAudioCursor. Intuitively, MovieAudio should be the place for this, but it seems some implementations don't actually open the sound file until the cursor is created, so MovieAudioCursor sounds like the place to get it. Is there a reason to also have it on AudioSound?

Would it be interesting to give thought to standardizing the keys between different formats, so that you can use the same code to access the title of an .mp3 file as for an .ogg file, via a named enum or named getters, or do you think the original keys should be kept?
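One way to picture such standardization is a small alias table in Python. This is only a sketch under assumptions: CommonTag and normalize_tags are hypothetical names, and it is not known from this thread which key spellings each cursor would actually surface. The aliases used are the Vorbis comment fields TITLE, ARTIST and ALBUM and the ID3v2 frame IDs TIT2, TPE1 and TALB.

```python
from enum import Enum


class CommonTag(Enum):
    """Hypothetical format-independent tag names."""
    TITLE = "title"
    ARTIST = "artist"
    ALBUM = "album"


# Format-specific keys mapped onto the common names.
_KEY_ALIASES = {
    "TITLE": CommonTag.TITLE,    # Vorbis comment field
    "TIT2": CommonTag.TITLE,     # ID3v2 title frame
    "ARTIST": CommonTag.ARTIST,  # Vorbis comment field
    "TPE1": CommonTag.ARTIST,    # ID3v2 lead performer frame
    "ALBUM": CommonTag.ALBUM,    # Vorbis comment field
    "TALB": CommonTag.ALBUM,     # ID3v2 album frame
}


def normalize_tags(tags):
    """Map format-specific keys to CommonTag members; unknown keys are dropped."""
    normalized = {}
    for key, value in tags.items():
        common = _KEY_ALIASES.get(key.upper())
        if common is not None:
            # Keep the first value if two aliases map to the same common tag.
            normalized.setdefault(common, value)
    return normalized


ogg_tags = normalize_tags({"TITLE": "my music track", "ARTIST": "artist"})
mp3_tags = normalize_tags({"TIT2": "my music track", "TPE1": "artist"})
print(ogg_tags[CommonTag.TITLE] == mp3_tags[CommonTag.TITLE])  # True
```

Dropping unknown keys is a design choice of the sketch; keeping the original keys available alongside the normalized ones would address the question of whether the original keys should be kept.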

Feel free to ping me on Discord for more discussion.

@Wizzerinus

Wizzerinus commented May 14, 2024

> We're trying to phase out Python extensions added in direct. They're only available when direct is loaded. Prefer a C++ interface if at all possible, or a C++-implemented extension method (see _ext.h and _ext.cxx files) if nothing else is possible.
>
> Doing it in C++ would mean making sequential access methods (which means you could parse it into a SimpleHashMap, that supports both key-based and indexed access) and exposing those to Python with MAKE_MAP_PROPERTY - do a grep for that to search for examples. We can keep the C++ methods making up the map-property as unpublished, so that the property is the only way to access it from Python.

Implemented this. An interesting consequence now is that you can no longer "parse all comments", only "comments with a given named tag". Not sure whether this is a significant issue or not. I could still expose the string as something like getCommentString() (since getRawComment() is now a vector). EDIT: could also make a list property for getRawComment(). EDIT2: this is now done.

> An alternative is making an AudioTags class which exposes a mapping interface (overriding operator []), and so behaves like a dictionary. Then we just need a getter property on the AudioSound. This would be the preferred method if the interface is bigger (like an AudioMetadata containing more than just tags), so that the AudioSound interface isn't cluttered up. If it's just a dict of tags, I think MAKE_MAP_PROPERTY might be easier.

Decided against it since it felt overkill.

> I don't think that we should have this interface on both AudioSound and MovieAudioCursor. Intuitively, MovieAudio should be the place for this, but it seems some implementations don't actually open the sound file until the cursor is created, so MovieAudioCursor sounds like the place to get it. Is there a reason to also have it on AudioSound?

AudioSound is what the loader returns, so I think having it there only makes sense. The other two having it is an implementation detail, because Panda3D actually has two levels of abstraction you have to go through.

> Would it be interesting to give thought to standardizing the keys between different formats, so that you can use the same code to access the title of an .mp3 file as for an .ogg file, via a named enum or named getters, or do you think the original keys should be kept?

I could implement a series of getters like getAuthor(), getTitle(), etc., but having them on AudioSound seems like overkill, and putting them into a separate class isn't much better, I think.
