ihaeyong/drama-graph

The Drama-Graph repository produces both a knowledge base of drama scripts and a video graph for the Video Turing Test (VTT).


Clone this repository:

>> git clone https://github.com/ihaeyong/drama-graph.git

Prepare the environment and install the requirements:

1. Create a conda environment.
>> conda create -n vtt_env python=3.6
>> source activate vtt_env
2. Install PyTorch and the required libraries.
>> conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=9.0 -c pytorch
>> pip install -r requirements.txt

Download the AnotherMissOh dataset

Unzip the AnotherMissOh dataset:

>> mkdir ./data
>> cd data
>> unzip datasets.zip
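
After unzipping, the layout referenced later in this README is assumed to look roughly like this (inferred from the paths used below, not verified against the archive):

./data/AnotherMissOh/
    AnotherMissOh_images/AnotherMissOh01/            # episode frames
    AnotherMissOh_Visual/AnotherMissOh01_visual.json # visual annotations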

You can set your data paths in 'Yolo_v2_pytorch/src/anotherMissOh_dataset.py' as follows:

img_path = './data/AnotherMissOh/AnotherMissOh_images/AnotherMissOh01/'
json_dir = './data/AnotherMissOh/AnotherMissOh_Visual/AnotherMissOh01_visual.json'
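
For context, here is a minimal sketch (not the repository's loader) of how these two paths could be consumed; the file extensions and JSON layout are assumptions:

# Minimal sketch: list the episode-01 frames and read the visual annotation
# JSON at the paths configured above. Not the repository's actual loader.
import json
import os

img_path = './data/AnotherMissOh/AnotherMissOh_images/AnotherMissOh01/'
json_dir = './data/AnotherMissOh/AnotherMissOh_Visual/AnotherMissOh01_visual.json'

with open(json_dir) as f:
    visual_annotations = json.load(f)

image_files = sorted(
    os.path.join(root, name)
    for root, _, names in os.walk(img_path)
    for name in names
    if name.lower().endswith(('.jpg', '.png'))
)
print(f'{len(image_files)} frames, {len(visual_annotations)} annotation entries')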

Drama-graph model

We mainly use YOLOv2-pytorch.

You can find all trained models at this link.

We fine-tuned YOLOv2 on the 20 person classes for about 50 epochs, using the training steps below.
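
For context, a small illustrative snippet of what the 20-class setting implies for the YOLOv2 detection head; the variable names are hypothetical and not taken from this repository:

# In YOLOv2 the detection head predicts, per anchor box, 4 box offsets +
# 1 objectness score + one score per class, so switching the class set to
# the 20 drama characters changes the head's output width.
num_anchors = 5
num_classes = 20                                      # 20 person identities
head_out_channels = num_anchors * (5 + num_classes)   # 125 output channels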

Train model:

Train the integrated model from scratch.

For place recognition, create a 'pre_model' folder and put the Places365 pre-trained model in it; a loading sketch follows the command below.

>> ./scripts/train_models.sh #gpu
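
A hedged sketch of how a Places365 pre-trained backbone could be loaded from the 'pre_model' folder; the checkpoint filename and key layout follow the public Places365 release and are assumptions, not this repository's code:

import torch
import torchvision.models as models

# Assumed filename from the public Places365 release; adjust to the file
# you actually place in ./pre_model/.
checkpoint = torch.load('pre_model/resnet18_places365.pth.tar',
                        map_location='cpu')
# The public Places365 checkpoints store weights under 'state_dict' with a
# 'module.' prefix added by DataParallel; strip it before loading.
state_dict = {k.replace('module.', ''): v
              for k, v in checkpoint['state_dict'].items()}
model = models.resnet18(num_classes=365)   # Places365 has 365 place classes
model.load_state_dict(state_dict)
model.eval()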

Train the sound event model from scratch.

>> ./scripts/train_sound_event.sh #gpu

The trained model is saved in ./sound_event_detection/checkpoint/torch_model.pt
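
A minimal sketch of reloading this checkpoint for inference; whether the file stores a whole serialized model or only a state_dict depends on how it was saved, so the snippet assumes the former:

import torch

# Assumes torch_model.pt was saved with torch.save(model, ...), i.e. it
# contains the full serialized model rather than only a state_dict.
model = torch.load('./sound_event_detection/checkpoint/torch_model.pt',
                   map_location='cpu')
model.eval()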

Test (drama graph generation) model:

Generate drama graphs for all episodes.

>> ./scripts/eval_models.sh #gpu

Evaluation

mAP for person

>> python eval_mAP.py -rtype person

mAP for behavior

>> python eval_mAP.py -rtype behave

mAP for face

>> python eval_mAP.py -rtype face

mAP for relation

>> python eval_mAP.py -rtype relation

mAP for object

>> python eval_mAP.py -rtype object
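
For readers unfamiliar with the metric, an illustrative sketch of how per-class average precision is typically computed; this is not the repository's eval_mAP.py implementation:

import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    """AP for one class: sort detections by confidence, accumulate
    precision/recall, and integrate precision over recall."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_true_positive, dtype=float)[order]
    fp = 1.0 - tp
    cum_tp, cum_fp = np.cumsum(tp), np.cumsum(fp)
    recall = cum_tp / max(num_gt, 1)
    precision = cum_tp / np.maximum(cum_tp + cum_fp, 1e-12)
    # Trapezoidal integration of the precision-recall curve; standard
    # benchmarks use an interpolated variant, this is a simplification.
    return float(np.trapz(precision, recall))

# mAP is then the mean of the per-class AP values.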

Accuracy for sound event detection and its visualization. The sed_vis folder should be located in the directory from which you run the scripts (./), i.e. drama-graph/sed_vis/ in this case.

>> ./scripts/eval_sound_event.sh
>> ./scripts/inference_sound_event.sh

Performance

Model              Train set   Validation   Test
Person detection   54.2%       50.6%        47.3%
Face detection     43.9%       25.83%       26.6%
Emotion            72.6%       80.6%        66.9%
Behavior           17.43%      3.9%         4.89%
Object detection   2.18%       1.17%        1.33%
Predicate          88.1%       88.8%        85.9%
Place              61.0%       41.1%        38.8%
Sound event        89.6%       69.0%        62.5%

Acknowledgements

This work was supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (2017-0-01780, The technology development for event recognition/relational reasoning and learning knowledge based system for video understanding).