Importing vmg files #93

Open
buongiorgio opened this issue Feb 11, 2023 · 11 comments
Labels
enhancement New feature or request

Comments

@buongiorgio

Hello,
On an old PC I found a bunch of old SMS messages in VMG format; they are from an old Nokia phone I had ages ago. Is there any chance of importing them?
Regards.

@tmo1
Owner

tmo1 commented Feb 12, 2023

VMG is apparently an ancient, proprietary format used by Nokia to store SMS messages (see here, here). I'm not going to directly incorporate support for it into SMS I/E, but it may be possible to write a converter to convert VMG messages into SMS I/E compatible JSON. If you're willing to post a few of the files (that do not contain sensitive or private data or metadata), I can take a look at them, and if the format looks simple and easily parseable, I may be able to write such a converter.

tmo1 added the enhancement label Feb 12, 2023
@buongiorgio
Author

It is ancient indeed, some of the sms I have are 16 years old. Attached a few examples with a couple of notes, note that the original file extension was .VMG, but I had to rename them to upload. Thanks and ask me if you need further information.
2.txt
3.txt
1.txt

@tmo1
Owner

tmo1 commented Feb 13, 2023

Attached a few examples with a couple of notes

I assume the original files end with the END:VMSG lines, and everything past that was added by you before uploading?

DT is the timestamp: I suppose is UTC

Yes, those are clearly ISO 8601 combined date and time representations, with the trailing Z denoting the zero UTC offset.
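
To illustrate the point about the trailing Z, here is a minimal sketch of turning such a timestamp into the epoch milliseconds that Android uses for SMS dates. The two layouts in `_DT_FORMATS` and the function name `to_epoch_millis` are assumptions made for illustration; the exact layout of the DT values in the attached files is not shown in this thread.

```python
from datetime import datetime, timezone

# Hypothetical candidate layouts for the DT value; the real files may use another one.
_DT_FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y%m%dT%H%M%SZ")


def to_epoch_millis(value):
    """Convert a UTC ISO 8601 timestamp ending in 'Z' to epoch milliseconds (as a string)."""
    for fmt in _DT_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
        # strptime returns a naive datetime; the trailing Z means UTC, so attach that explicitly.
        return str(int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000))
    raise ValueError("unrecognised DT value: " + repr(value))


print(to_epoch_millis("1970-01-01T00:00:01Z"))  # 1000
```

Attaching `timezone.utc` explicitly matters here: calling `timestamp()` on a naive datetime would interpret it in the local time zone of the machine running the conversion.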

@buongiorgio
Author

I assume the original files end with the END:VMSG lines, and everything past that was added by you before uploading?

Yes, that's right.

@tmo1
Owner

tmo1 commented Feb 17, 2023

Okay, I have written (in Python 3) an initial attempt at a converter. It seems to work correctly on the three messages you provided, but I don't know how flexible / complex the VMG format is, so I have no idea whether it will work on other VMG messages.

To use it, put some VMG files in a directory containing nothing else (e.g., vmgs), then run (-d sets debugging output):

vmg-convert.py -d vmgs > converted-vmg-messages.json

  1. If any errors are reported, please post them here, along with the files that triggered them. You can redact the files, but please don't add any notes or change anything more than absolutely necessary - notes should be posted here.
  2. Examine the output and see if it looks right, as far as you can tell.
  3. Try to import the file with SMS I/E (preferably into an emulated device or one you don't mind wiping, since if things go wrong, it can be a pain to track down all the messages and delete them).

One thing I'm not sure about is how to correctly distinguish between incoming and outgoing messages. Currently, we simply assume that if X-IRMC-BOX is SENT then it's outgoing, and otherwise it's incoming, but I'm not sure how correct that is.
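
The heuristic described above can be sketched roughly as follows. This is not the converter's actual code: the function name `message_type` is hypothetical, and the use of string values is an assumption. The numeric values themselves are Android's `Telephony.Sms` message type constants (1 for inbox, 2 for sent).

```python
# Android's Telephony.Sms message type constants: 1 = inbox (received), 2 = sent.
# Stored as strings here on the assumption that the JSON uses string values.
MESSAGE_TYPE_INBOX = "1"
MESSAGE_TYPE_SENT = "2"


def message_type(irmc_box):
    """Map an X-IRMC-BOX value to a message type: SENT is outgoing, anything else incoming."""
    if irmc_box.strip().upper() == "SENT":
        return MESSAGE_TYPE_SENT
    return MESSAGE_TYPE_INBOX


print(message_type("SENT"))
```

Any value other than SENT falls through to "incoming", which is exactly the part of the heuristic that may turn out to be wrong for some files.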

@buongiorgio
Author

Hello,
first attempt was unsuccessful due to file encoding. On linux (debian 9.9 with python 3.5.3) the original files are seen as data:

$ file SMS/0001.vmg
SMS/0001.vmg: data

On a win10 box, they are seen as UTF-16 Little-endian (according to Notepad).

Converted to UTF-16 Big-endian (on win) now linux likes them better:
$ file vmgs/0040.vmg
vmgs/0040.vmg: Big-endian UTF-16 Unicode text

There is an issue with the date (file attached: 0040.txt):
$ ./vmg-convert.py -d vmgs > converted-vmg-messages.json
Processing 0040.vmg
Traceback (most recent call last):
File "./vmg-convert.py", line 58, in
sms['date'] = str(int(datetime.fromisoformat(value).timestamp() * 1000))
AttributeError: type object 'datetime.datetime' has no attribute 'fromisoformat'

@tmo1
Owner

tmo1 commented Feb 19, 2023

Hello, first attempt was unsuccessful due to file encoding. On linux (debian 9.9 with python 3.5.3) the original files are seen as data:

$ file SMS/0001.vmg
SMS/0001.vmg: data

On a win10 box, they are seen as UTF-16 Little-endian (according to Notepad).

Converted to UTF-16 Big-endian (on win) now linux likes them better:

$ file vmgs/0040.vmg
vmgs/0040.vmg: Big-endian UTF-16 Unicode text

This is why I really need access to files that are as close to the originals as possible. I hard-coded an assumption of UTF-16 since on my Debian Sid system, the versions you posted present as UTF-16 little-endian:

$ file 1.txt 
1.txt: Unicode text, UTF-16, little-endian text, with CRLF, LF line terminators

There is an issue (0040.txt) with the date:

Processing 0040.vmg
Traceback (most recent call last):
File "./vmg-convert.py", line 58, in
sms['date'] = str(int(datetime.fromisoformat(value).timestamp() * 1000))
AttributeError: type object 'datetime.datetime' has no attribute 'fromisoformat'

It works fine here, without error. FTR, the file presents as:

$ file 0040.txt 
0040.txt: Unicode text, UTF-16, big-endian text

All four of the files I have begin with proper BOM marks: the first three with FF FE (little-endian), and the fourth with FE FF (big-endian). I wonder if the error you're hitting is somehow being caused by your system improperly understanding the file encoding.
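
For reference, a minimal sketch of picking the byte order from the BOM instead of hard-coding it. The function name `decode_vmg` is hypothetical, and the fallback for files without a BOM is purely an assumption (little-endian, as Notepad reported for the originals); it is not something the attached files confirm.

```python
import codecs


def decode_vmg(raw):
    """Decode raw VMG bytes, choosing the UTF-16 byte order from the BOM."""
    if raw.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return raw[2:].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return raw[2:].decode("utf-16-be")
    # Assumption: a file without a BOM is little-endian.
    return raw.decode("utf-16-le")


print(decode_vmg(b"\xff\xfeO\x00K\x00"))  # OK
print(decode_vmg(b"\xfe\xff\x00O\x00K"))  # OK
```

Python's built-in `utf-16` codec also honours a leading BOM on its own, so the explicit check is mainly useful for controlling what happens when no BOM is present.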

@tmo1
Owner

tmo1 commented Feb 19, 2023

On second thought, the problem is clearly that your version of Python is too old - datetime.fromisoformat was introduced in Python 3.7. Try running the tool using a more recent version of Python (you mentioned 3.5.3 - that's a rather old version).
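
A small guard along these lines would turn the AttributeError into a clearer message. This is only a sketch; `MIN_PYTHON` and `python_is_new_enough` are hypothetical names, not part of the converter.

```python
import sys

MIN_PYTHON = (3, 7)  # datetime.fromisoformat is available from this version on


def python_is_new_enough(version_info=sys.version_info):
    """Return True if the given interpreter version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_new_enough():
    sys.exit("This script needs Python %d.%d or newer" % MIN_PYTHON)
```

The message is built with `%` formatting rather than an f-string so that the check itself still parses on the old interpreters it is meant to catch.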

@buongiorgio
Author

Hello,
I think the issue was with the python version. I had 3.5 on linux, I tried with the latest (on windows) and I succeeded. I imported a batch of 49 messages, no error during processing. I loaded the resulting file on a Samsung Galaxy S5 with LineageOS 17 (it's a testing device, it was empty) and the SMSes have been imported. They seem correctly handled (UTF chars correctly recognised, timestamp correct, sender correct). I'll try with all the messages and keep you posted.
Thanks!

@buongiorgio
Author

I imported the bulk of the messages. So far, the only issue is with the sent messages: the receiver's number is treated as the sender's. Take the 2.txt file as an example; this is the relevant part:

BEGIN:VCARD
VERSION:2.1
N:Nome
TEL:+39338000000
END:VCARD

N:Nome contains the name of the receiver (in this case "Nome", this has been redacted)
TEL:+39338000000 is the phone number of the receiver (redacted)

In the imported file +39338000000 is treated as the number of the sender.
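
To make the discussion concrete, here is a rough sketch of pulling the name and number out of a block like the one above. `parse_vcard` is a hypothetical helper, not the converter's code, and it assumes one `KEY:value` pair per line with no folded lines or parameters.

```python
def parse_vcard(text):
    """Return the N and TEL values from the first BEGIN:VCARD ... END:VCARD block."""
    fields = {}
    inside = False
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN:VCARD":
            inside = True
        elif line == "END:VCARD":
            break
        elif inside and ":" in line:
            key, value = line.split(":", 1)
            fields.setdefault(key, value)  # keep the first occurrence of each key
    return {"name": fields.get("N"), "number": fields.get("TEL")}


sample = """BEGIN:VCARD
VERSION:2.1
N:Nome
TEL:+39338000000
END:VCARD"""
print(parse_vcard(sample))
```

Whether the extracted number belongs to the sender or the receiver is not stored in the vCard itself; it has to be inferred from the message's direction.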

@tmo1
Owner

tmo1 commented Feb 20, 2023

In the imported file +39338000000 is treated as the number of the sender.

I'm not sure what you're seeing, but on my device, it looks correct: +39338000000 is shown as the recipient's number ("To:" in Message Details), not the sender's.
