Video frame extraction #4085

Open
wants to merge 94 commits into
base: main
94 commits
423560b
mermaid front-end rendering initialization exception handling logic o…
Oct 24, 2023
948157c
del mermaid front-end rendering rerender
Oct 25, 2023
d8ca863
application embedded add chrome && ChatBot Chrome plugin update v1.5
Nov 7, 2023
f722800
Style optimization
Nov 7, 2023
e530083
chore: ui
crazywoola Nov 7, 2023
67d2bb0
chore: ui
crazywoola Nov 7, 2023
a2d30d0
Merge remote-tracking branch 'origin/main' into embedded-chrome-plugin
Nov 9, 2023
6457b8f
Merge remote-tracking branch 'origin/embedded-chrome-plugin' into emb…
Nov 9, 2023
a33fbd9
Chrome PlugIn Url Change
Nov 9, 2023
43fa756
azure openai add gpt-4-1106-preview、gpt-4-vision-preview models
Dec 13, 2023
414534c
Update web/i18n/lang/common.en.ts
charli117 Dec 13, 2023
a310fed
Update web/i18n/lang/common.zh.ts
charli117 Dec 13, 2023
07e3e01
change azure chat _generate function replace langchain _convert_messa…
Dec 13, 2023
65e980d
Merge pull request #3 from charli117/azure-openai-add-gpt4v&1106
charli117 Dec 14, 2023
d492e5a
Merge pull request #4 from charli117/mermaid-front-end-optimize
charli117 Dec 14, 2023
5af4ff6
Merge pull request #5 from charli117/embedded-chrome-plugin
charli117 Dec 14, 2023
7c9fd91
Merge branch 'langgenius:main' into main
charli117 Dec 19, 2023
df70726
Merge branch 'langgenius:main' into main
charli117 Dec 21, 2023
9731821
Merge branch 'langgenius:main' into main
charli117 Dec 23, 2023
cd97b61
Merge branch 'langgenius:main' into main
charli117 Dec 26, 2023
c9bebec
Merge branch 'langgenius:main' into main
charli117 Jan 4, 2024
4107879
Merge branch 'langgenius:main' into main
charli117 Jan 4, 2024
8a41876
Merge branch 'langgenius:main' into main
charli117 Jan 5, 2024
7d42fd7
Merge branch 'langgenius:main' into main
charli117 Jan 5, 2024
057c253
Merge branch 'langgenius:main' into main
charli117 Jan 8, 2024
876e431
Merge branch 'langgenius:main' into main
charli117 Jan 8, 2024
27869ca
Merge branch 'langgenius:main' into main
charli117 Jan 9, 2024
304e83d
Merge branch 'langgenius:main' into main
charli117 Jan 15, 2024
eb895f7
Merge branch 'langgenius:main' into main
charli117 Jan 18, 2024
22823b8
Merge branch 'langgenius:main' into main
charli117 Jan 20, 2024
515b5df
Merge branch 'langgenius:main' into main
charli117 Jan 23, 2024
de20db2
Merge branch 'langgenius:main' into main
charli117 Jan 23, 2024
3d57b6d
Merge branch 'langgenius:main' into main
charli117 Jan 24, 2024
1fe2843
Merge branch 'langgenius:main' into main
charli117 Jan 24, 2024
ed66e9d
Merge branch 'langgenius:main' into main
charli117 Jan 24, 2024
767dc1a
Merge branch 'langgenius:main' into main
charli117 Jan 24, 2024
32c4f97
Merge branch 'langgenius:main' into main
charli117 Jan 26, 2024
4168c9a
Merge branch 'langgenius:main' into main
charli117 Jan 26, 2024
c8858de
Merge branch 'langgenius:main' into main
charli117 Jan 26, 2024
5502518
Merge branch 'langgenius:main' into main
charli117 Jan 28, 2024
1615765
Merge branch 'langgenius:main' into main
charli117 Jan 30, 2024
404dcab
Merge branch 'langgenius:main' into main
charli117 Jan 30, 2024
03132ba
Merge branch 'langgenius:main' into main
charli117 Jan 31, 2024
e0f4e70
Merge branch 'langgenius:main' into main
charli117 Feb 5, 2024
859e320
Merge branch 'langgenius:main' into main
charli117 Feb 5, 2024
40122b9
Merge branch 'langgenius:main' into main
charli117 Feb 5, 2024
a052150
Merge branch 'langgenius:main' into main
charli117 Feb 6, 2024
7aa36ef
Merge branch 'langgenius:main' into main
charli117 Feb 7, 2024
4a1d695
Merge branch 'langgenius:main' into main
charli117 Feb 7, 2024
271262e
Merge branch 'langgenius:main' into main
charli117 Feb 8, 2024
09389ee
Merge branch 'langgenius:main' into main
charli117 Feb 9, 2024
2030292
Merge branch 'langgenius:main' into main
charli117 Feb 13, 2024
64a213a
Merge branch 'langgenius:main' into main
charli117 Feb 14, 2024
e499052
Merge branch 'langgenius:main' into main
charli117 Feb 15, 2024
c3380b7
Merge branch 'langgenius:main' into main
charli117 Feb 15, 2024
92505e0
Merge branch 'langgenius:main' into main
charli117 Feb 15, 2024
dd47440
Merge branch 'langgenius:main' into main
charli117 Feb 16, 2024
f22dd96
Merge branch 'langgenius:main' into main
charli117 Feb 18, 2024
f278346
Merge branch 'langgenius:main' into main
charli117 Feb 21, 2024
0f8821e
Merge branch 'langgenius:main' into main
charli117 Feb 28, 2024
8954daf
Merge branch 'langgenius:main' into main
charli117 Feb 28, 2024
6b51618
Merge branch 'langgenius:main' into main
charli117 Mar 3, 2024
decffd5
Merge branch 'langgenius:main' into main
charli117 Mar 4, 2024
d47b669
Merge branch 'langgenius:main' into main
charli117 Mar 6, 2024
4e89118
Merge branch 'langgenius:main' into main
charli117 Mar 13, 2024
7abcb6f
Merge branch 'langgenius:main' into main
charli117 Apr 22, 2024
8ceced9
Merge branch 'langgenius:main' into main
charli117 Apr 26, 2024
f6659d8
optimize the knowledge failed documents query
Apr 26, 2024
b591a43
Merge branch 'langgenius:main' into main
charli117 Apr 26, 2024
3b326e7
add avg rag delay time monitoring curve
Apr 26, 2024
2e5b17d
Merge branch 'langgenius:main' into main
charli117 Apr 26, 2024
effc7a5
add elapsed_time to return_retriever_resource_info
Apr 26, 2024
492c9ab
fix lose elapsed_time
Apr 26, 2024
6920d37
fix I001 error
Apr 26, 2024
fa16bed
fix sort-imports
Apr 26, 2024
2fa5703
Merge remote-tracking branch 'langgenius/main' into video-frame-extra…
May 3, 2024
6772791
Merge remote-tracking branch 'langgenius/main' into video-frame-extra…
May 4, 2024
0a493c0
Video content inference based on frame extraction
May 4, 2024
76eb2ad
ruff error fix
May 4, 2024
e3cdf32
ruff error fix
May 4, 2024
d3299fa
Merge remote-tracking branch 'public/video-frame-extraction' into vid…
May 4, 2024
b16cd90
fix web style errors
May 4, 2024
799806b
fix error when extracting only video audio content
May 4, 2024
5514d6d
Delete api/.dockerignore
charli117 May 4, 2024
3ceb86d
Delete api/config.py
charli117 May 4, 2024
f1395f0
Restore file
May 4, 2024
c7eca87
Merge remote-tracking branch 'public/video-frame-extraction' into vid…
May 4, 2024
f70aa6e
Restore file
May 4, 2024
1dbdc64
Restore file
May 4, 2024
592a765
Restore file
May 4, 2024
de1d90c
Restore file
May 4, 2024
1022aba
fix when either video frame extraction or audio extraction disabled j…
May 5, 2024
1aad855
fix python style errors
patryk20120 May 5, 2024
d67e9fe
clear console log and message log unable to display video
May 5, 2024
2 changes: 1 addition & 1 deletion api/config.py
Expand Up @@ -272,7 +272,7 @@ def __init__(self):
self.SMTP_USERNAME = get_env('SMTP_USERNAME')
self.SMTP_PASSWORD = get_env('SMTP_PASSWORD')
self.SMTP_USE_TLS = get_bool_env('SMTP_USE_TLS')

# ------------------------
# Workspace Configurations.
# ------------------------
Expand Down
2 changes: 1 addition & 1 deletion api/controllers/console/app/completion.py
Expand Up @@ -78,7 +78,7 @@ def post(self, app_model):
except ValueError as e:
raise e
except Exception as e:
logging.exception("internal server error.")
logging.exception(f"internal server error, {str(e)}")
raise InternalServerError()


Expand Down
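A note on the logging change in this hunk: `logging.exception` logs at ERROR level and appends the traceback of the exception being handled, and the last line of that traceback already carries the exception message. The minimal sketch below illustrates this with the standard library only. The logger name `completion_sketch` and the helper `log_failure` are hypothetical and not part of this PR.

```python
import io
import logging


def log_failure(stream: io.StringIO) -> str:
    """Log a handled exception and return everything written to the stream."""
    logger = logging.getLogger('completion_sketch')  # hypothetical logger name
    logger.propagate = False
    logger.setLevel(logging.ERROR)
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        raise RuntimeError('boom')
    except Exception:
        # No f-string needed: the traceback (including the message) is appended.
        logger.exception('internal server error.')
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

Calling `log_failure(io.StringIO())` returns the static message followed by the full traceback, so the interpolated `str(e)` mainly duplicates information that is already logged.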
2 changes: 1 addition & 1 deletion api/controllers/console/datasets/file.py
Expand Up @@ -52,7 +52,7 @@ def post(self):
if len(request.files) > 1:
raise TooManyFilesError()
try:
upload_file = FileService.upload_file(file, current_user)
upload_file = FileService.upload_file(file=file, user=current_user, tenant_id=current_user.current_tenant_id)
except services.errors.file.FileTooLargeError as file_too_large_error:
raise FileTooLargeError(file_too_large_error.description)
except services.errors.file.UnsupportedFileTypeError:
Expand Down
1 change: 0 additions & 1 deletion api/controllers/files/__init__.py
Expand Up @@ -5,5 +5,4 @@
bp = Blueprint('files', __name__)
api = ExternalApi(bp)


from . import image_preview, tool_files
8 changes: 4 additions & 4 deletions api/controllers/files/tool_files.py
Expand Up @@ -20,10 +20,10 @@ def get(self, file_id, extension):
args = parser.parse_args()

if not ToolFileManager.verify_file(file_id=file_id,
timestamp=args['timestamp'],
nonce=args['nonce'],
sign=args['sign'],
):
timestamp=args['timestamp'],
nonce=args['nonce'],
sign=args['sign'],
):
raise Forbidden('Invalid request.')

try:
Expand Down
3 changes: 2 additions & 1 deletion api/controllers/web/file.py
Expand Up @@ -15,6 +15,7 @@ class FileApi(WebApiResource):
def post(self, app_model, end_user):
# get file from request
file = request.files['file']
source = request.args.get('source')

# check file
if 'file' not in request.files:
Expand All @@ -23,7 +24,7 @@ def post(self, app_model, end_user):
if len(request.files) > 1:
raise TooManyFilesError()
try:
upload_file = FileService.upload_file(file, end_user)
upload_file = FileService.upload_file(file=file, user=end_user, tenant_id=app_model.tenant_id, source=source)
except services.errors.file.FileTooLargeError as file_too_large_error:
raise FileTooLargeError(file_too_large_error.description)
except services.errors.file.UnsupportedFileTypeError:
Expand Down
1 change: 1 addition & 0 deletions api/core/app/app_config/entities.py
Expand Up @@ -188,6 +188,7 @@ class FileExtraConfig(BaseModel):
File Upload Entity.
"""
image_config: Optional[dict[str, Any]] = None
video_config: Optional[dict[str, Any]] = None


class AppAdditionalFeatures(BaseModel):
Expand Down
20 changes: 17 additions & 3 deletions api/core/app/app_config/features/file_upload/manager.py
Expand Up @@ -12,6 +12,8 @@ def convert(cls, config: dict, is_vision: bool = True) -> Optional[FileExtraConf
:param config: model config args
:param is_vision: if True, the feature is vision feature
"""
video_config = None
image_config = None
file_upload_dict = config.get('file_upload')
if file_upload_dict:
if 'image' in file_upload_dict and file_upload_dict['image']:
Expand All @@ -24,9 +26,21 @@ def convert(cls, config: dict, is_vision: bool = True) -> Optional[FileExtraConf
if is_vision:
image_config['detail'] = file_upload_dict['image']['detail']

return FileExtraConfig(
image_config=image_config
)
if 'video' in file_upload_dict and file_upload_dict['video']:
video_config = dict()
if file_upload_dict['video']['extract_video'] == 'enabled':
video_config.update({
'extract_video': file_upload_dict['video']['extract_video'],
'max_collect_frames': file_upload_dict['video']['max_collect_frames'],
'similarity_threshold': file_upload_dict['video']['similarity_threshold'],
'blur_threshold': file_upload_dict['video']['blur_threshold']
})
if file_upload_dict['video']['extract_audio'] == 'enabled':
video_config.update({
'extract_audio': file_upload_dict['video']['extract_audio']
})

return FileExtraConfig(image_config=image_config, video_config=video_config)

return None

Expand Down
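The video branch of `convert` above indexes `file_upload_dict['video']['extract_video']` and the other keys directly, so a config that omits one of them would raise `KeyError`. The sketch below shows the same selection logic written with `.get()`, assuming that a missing key should be treated as "not enabled". The function name `build_video_config` is hypothetical, and this is an illustration rather than the PR's implementation.

```python
from typing import Any, Optional


def build_video_config(file_upload: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the video settings to keep, or None when there is no video section."""
    video = file_upload.get('video')
    if not video:
        return None

    video_config: dict[str, Any] = {}
    if video.get('extract_video') == 'enabled':
        video_config['extract_video'] = 'enabled'
        # Frame-extraction tuning values are only kept when extraction is on.
        for key in ('max_collect_frames', 'similarity_threshold', 'blur_threshold'):
            video_config[key] = video.get(key)
    if video.get('extract_audio') == 'enabled':
        video_config['extract_audio'] = 'enabled'
    return video_config
```

As in the diff, a video section with both options disabled yields an empty dict rather than `None`.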
140 changes: 124 additions & 16 deletions api/core/file/file_obj.py
@@ -1,18 +1,31 @@
import base64
import enum
import logging
from typing import Optional

import click
from pydantic import BaseModel

from core.app.app_config.entities import FileExtraConfig
from core.file.file_parser_cache import FileParserCache
from core.file.tool_file_parser import ToolFileParser
from core.file.upload_file_parser import UploadFileParser
from core.model_runtime.entities.message_entities import ImagePromptMessageContent
from core.model_runtime.entities.message_entities import (
ImagePromptMessageContent,
TextPromptMessageContent,
VideoPromptMessageContent,
)
from extensions.ext_database import db
from models.model import UploadFile
from models.account import Account
from models.model import App, UploadFile
from services.audio_service import AudioService
from services.extract_video_frames import ExtractVideoFrames
from services.file_service import FileService


class FileType(enum.Enum):
IMAGE = 'image'
VIDEO = 'video'

@staticmethod
def value_of(value):
Expand All @@ -34,6 +47,7 @@ def value_of(value):
return member
raise ValueError(f"No matching enum found for value '{value}'")


class FileBelongsTo(enum.Enum):
USER = 'user'
ASSISTANT = 'assistant'
Expand All @@ -57,6 +71,8 @@ class FileVar(BaseModel):
filename: Optional[str] = None
extension: Optional[str] = None
mime_type: Optional[str] = None
app_id: Optional[str] = None
description: Optional[str] = None

def to_dict(self) -> dict:
return {
Expand All @@ -69,6 +85,7 @@ def to_dict(self) -> dict:
'filename': self.filename,
'extension': self.extension,
'mime_type': self.mime_type,
'description': self.video_text,
}

def to_markdown(self) -> str:
Expand All @@ -93,6 +110,21 @@ def data(self) -> Optional[str]:
"""
return self._get_data()

@property
def video_text(self) -> Optional[str]:
"""
Get the text transcribed from the video's audio track,
decoded to str if the cached value is bytes
:return: transcript text, or None if audio extraction is disabled
"""
audio_text = self._get_video_text()
if isinstance(audio_text, bytes):
audio_data = audio_text.decode('utf-8')
else:
audio_data = audio_text
logging.info(click.style(f"video text: {audio_data}", fg='green'))
return audio_data

@property
def preview_url(self) -> Optional[str]:
"""
Expand All @@ -102,34 +134,110 @@ def preview_url(self) -> Optional[str]:
return self._get_data(force_url=True)

@property
def prompt_message_content(self) -> ImagePromptMessageContent:
if self.type == FileType.IMAGE:
image_config = self.extra_config.image_config
def prompt_message_content(
self) -> ImagePromptMessageContent | VideoPromptMessageContent | TextPromptMessageContent:
image_config = self.extra_config.image_config
video_config = self.extra_config.video_config

if self.type == FileType.IMAGE:
return ImagePromptMessageContent(
data=self.data,
detail=ImagePromptMessageContent.DETAIL.HIGH
if image_config.get("detail") == "high" else ImagePromptMessageContent.DETAIL.LOW
)
if self.type == FileType.VIDEO:
if video_config.get('extract_video') != 'enabled' and video_config.get('extract_audio') == 'enabled':
return TextPromptMessageContent(data=self.video_text)
elif video_config.get('extract_video') == 'enabled' and video_config.get('extract_audio') != 'enabled':
return ImagePromptMessageContent(
data=self.data,
detail=ImagePromptMessageContent.DETAIL.HIGH
if image_config.get("detail") == "high" else ImagePromptMessageContent.DETAIL.LOW
)
elif video_config.get('extract_video') == 'enabled' and video_config.get('extract_audio') == 'enabled':
return VideoPromptMessageContent(
data=self.data,
detail=VideoPromptMessageContent.DETAIL.HIGH
if image_config.get("detail") == "high" else VideoPromptMessageContent.DETAIL.LOW,
description=self.video_text
)
else:
raise ValueError('At least one of video frame extraction or audio extraction must be enabled.')

def _get_data(self, force_url: bool = False) -> Optional[str]:
if self.type == FileType.IMAGE:
if self.transfer_method == FileTransferMethod.REMOTE_URL:
return self.url
elif self.transfer_method == FileTransferMethod.LOCAL_FILE:
upload_file = (db.session.query(UploadFile)
.filter(
UploadFile.id == self.related_id,
UploadFile.tenant_id == self.tenant_id
).first())

return UploadFileParser.get_image_data(
upload_file=upload_file,
force_url=force_url
)
upload_file = db.session.query(UploadFile).filter(UploadFile.id == self.related_id,
UploadFile.tenant_id == self.tenant_id).first()

file_cache = FileParserCache(file_id=upload_file.id,
file_type=upload_file.extension,
separation_type='image')
if file_cache.get():
image_data = file_cache.get()
else:
image_data = UploadFileParser.get_image_data(upload_file=upload_file, force_url=force_url)
file_cache.set(file_content=image_data, ttl=3600)
return image_data
elif self.transfer_method == FileTransferMethod.TOOL_FILE:
extension = self.extension
# add sign url
return ToolFileParser.get_tool_file_manager().sign_file(tool_file_id=self.related_id, extension=extension)
return ToolFileParser.get_tool_file_manager().sign_file(tool_file_id=self.related_id,
extension=extension)
if self.type == FileType.VIDEO:
video_config = self.extra_config.video_config

upload_file = db.session.query(UploadFile).filter(UploadFile.id == self.related_id,
UploadFile.tenant_id == self.tenant_id).first()

# Video frame extraction and audio extraction
if video_config.get('extract_video') == 'enabled':
file_cache = FileParserCache(file_id=upload_file.id,
file_type=upload_file.extension,
separation_type='video')
if file_cache.get():
video_data = file_cache.get()
else:
data = ExtractVideoFrames(max_collect_frames=video_config['max_collect_frames'],
similarity_threshold=video_config['similarity_threshold'],
blur_threshold=video_config['blur_threshold'],
file=upload_file).process_video()
if force_url is True:
image_upload_file = FileService.upload_file(file=data,
file_name=f'{upload_file.name.split(".")[0]}.jpg',
tenant_id=upload_file.tenant_id)
video_data = UploadFileParser.get_signed_temp_image_url(upload_file_id=image_upload_file.id)
else:
encoded_string = base64.b64encode(data).decode('utf-8')
video_data = f'data:image/jpeg;base64,{encoded_string}'
file_cache.set(file_content=video_data, ttl=3600)
return video_data
return None

def _get_video_text(self) -> Optional[str]:
"""
Get video text data
:return:
"""
if self.type == FileType.VIDEO:
video_config = self.extra_config.video_config

if video_config.get('extract_audio') == 'enabled':
upload_file = db.session.query(UploadFile).filter(UploadFile.id == self.related_id,
UploadFile.tenant_id == self.tenant_id).first()

file_cache = FileParserCache(file_id=upload_file.id, file_type=upload_file.extension,
separation_type='audio')
if file_cache.get():
return file_cache.get()
elif not file_cache.get() and self.app_id:
user_info = db.session.query(Account).filter(Account.id == upload_file.created_by).first()
app_info = db.session.query(App).filter(App.id == self.app_id).first()

audio_text = AudioService.transcript_asr(app_model=app_info, file=upload_file, end_user=user_info)
audio_data = audio_text.get('text').strip()
file_cache.set(file_content=audio_data)
return audio_data
return None
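The branching in `prompt_message_content` above reduces to a small decision table over two flags. The sketch below restates it with plain strings instead of the PR's message classes. It also guards the `detail` lookup, because `convert` initializes `image_config` to `None`, so `image_config.get("detail")` in the video branches would appear to fail when only video upload is configured. The names `select_content_kind` and `resolve_detail` are hypothetical.

```python
from typing import Any, Optional


def select_content_kind(video_config: dict[str, Any]) -> str:
    """Map the two extraction flags to the kind of prompt content to build."""
    frames = video_config.get('extract_video') == 'enabled'
    audio = video_config.get('extract_audio') == 'enabled'
    if frames and audio:
        return 'video'  # extracted frames plus transcript
    if frames:
        return 'image'  # extracted frames only
    if audio:
        return 'text'   # transcript only
    raise ValueError('At least one of video frame extraction or audio extraction must be enabled.')


def resolve_detail(image_config: Optional[dict[str, Any]]) -> str:
    """Return 'high' only when explicitly configured; tolerate a missing image config."""
    return 'high' if (image_config or {}).get('detail') == 'high' else 'low'
```

Computing the two booleans once keeps the four cases visibly exhaustive, which is harder to check in the chained `elif` form.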
42 changes: 42 additions & 0 deletions api/core/file/file_parser_cache.py
@@ -0,0 +1,42 @@
from typing import Optional

from extensions.ext_redis import redis_client


class FileParserCache:
    def __init__(self, file_id: str, file_type: str, separation_type: str):
        # Key layout: media:<image|video|audio>:<file id>.<file extension>
        self.cache_key = f"media:{separation_type}:{file_id}.{file_type}"

    def get(self) -> Optional[str]:
        """
        Get the cached parse result for this file.

        :return: cached content as text, or None on a cache miss
        """
        cached_file_parser = redis_client.get(self.cache_key)
        if cached_file_parser:
            try:
                cached_file_parser = cached_file_parser.decode('utf-8')
            except (AttributeError, UnicodeDecodeError):
                # Already a str, or not valid UTF-8: return the value unchanged.
                pass
            return cached_file_parser
        else:
            return None

    def set(self, file_content: str, ttl: int = 86400) -> None:
        """
        Cache the parse result for this file.

        :param file_content: file content
        :param ttl: cache expiration time in seconds
        :return:
        """
        redis_client.setex(name=self.cache_key, time=ttl, value=file_content)

    def delete(self) -> None:
        """
        Delete the cached parse result for this file.

        :return:
        """
        redis_client.delete(self.cache_key)
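The call sites in `file_obj.py` call `file_cache.get()` twice on a hit, once to test and once to read, which costs two cache round trips. As far as the shown code indicates, the cache key also does not encode `force_url`, so a URL request and a base64 request for the same file share one entry. The sketch below shows a read-once "get or compute" pattern against an in-memory stand-in, so it runs without Redis. `InMemoryCache` and `cached_or_compute` are hypothetical names, and the stand-in's `setex` only mimics the argument order used here.

```python
from time import monotonic
from typing import Callable, Optional


class InMemoryCache:
    """Hypothetical stand-in for a Redis client: get and setex only."""

    def __init__(self) -> None:
        self._store: dict[str, tuple[float, bytes]] = {}

    def get(self, name: str) -> Optional[bytes]:
        entry = self._store.get(name)
        if entry is None:
            return None
        expires_at, value = entry
        if monotonic() >= expires_at:
            del self._store[name]  # expired: behave like a miss
            return None
        return value

    def setex(self, name: str, ttl: int, value: str) -> None:
        self._store[name] = (monotonic() + ttl, value.encode('utf-8'))


def cached_or_compute(cache: InMemoryCache, key: str,
                      compute: Callable[[], str], ttl: int = 3600) -> str:
    """Read the cache once; compute and store the value only on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached.decode('utf-8')
    value = compute()
    cache.setex(key, ttl, value)
    return value
```

If URL and base64 results need to stay separate, one option is to make the requested format part of `key`; whether that matters depends on how `force_url` is used by callers.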