OpenGVLab/Ask-Anything
A simple yet interesting tool for chatting about video with ChatGPT, miniGPT4, and StableLM.
Language: Python
#captioning_videos #chat #chatgpt #gradio #langchain #moss #stablelm #video #video_question_answering #video_understanding
Stars: 294 Issues: 2 Forks: 15
https://github.com/OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS. - OpenGVLab/Ask-Anything
THUDM/GLM-4.1V-Thinking
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.
Language: Python
#image2text #reasoning #video_understanding #vlm
Stars: 449 Issues: 9 Forks: 8
https://github.com/THUDM/GLM-4.1V-Thinking
GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning - zai-org/GLM-V
bytedance/Lance
A 3B-active-parameter native unified multimodal model for image and video understanding, generation, and editing.
Language: Python
#image_editing #image_generation #image_understanding #unified_multimodal_models #video_generation #video_understanding
Stars: 696 Issues: 10 Forks: 38
https://github.com/bytedance/Lance