🌟 SEE-2-SOUND - a method for generating complex spatial sound based on images and videos

Install: pip install see2sound

🖥 GitHub
🟡 Hugging Face
🟡 arXiv

@Machine_learn
Hello! Friends who have a paper can submit it to this journal and suggest me as a reviewer.
@Machine_learn
Minutes to Seconds: Speeded-up DDPM-based Image Inpainting with Coarse-to-Fine Sampling

🖥 GitHub: https://github.com/linghuyuhangyuan/m2s

📕 Paper: https://arxiv.org/abs/2407.05875v1

🔥 Dataset: https://paperswithcode.com/task/denoising

@Machine_learn
👁‍🗨 LongVA: Long Context Transfer from Language to Vision

▪ GitHub: https://github.com/EvolvingLMMs-Lab/LongVA
▪ Paper: https://arxiv.org/abs/2406.16852
▪ Project: https://lmms-lab.github.io/posts/longva/
▪ Demo: https://longva-demo.lmms-lab.com/

@Machine_learn
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation (ECCV 2024)

🖥 GitHub: https://github.com/fanghaook/ovformer

📕 Paper: https://arxiv.org/abs/2407.07427v1

@Machine_learn
Multimodal contrastive learning for spatial gene expression prediction using histology images

🖥 GitHub: https://github.com/modelscope/data-juicer

📕 Paper: https://arxiv.org/abs/2407.08583v1

🚀 Dataset: https://paperswithcode.com/dataset/coco

@Machine_learn
🌟 An Empirical Study of Mamba-based Pedestrian Attribute Recognition

🖥 GitHub: https://github.com/event-ahu/openpar

📕 Paper: https://arxiv.org/pdf/2407.10374v1.pdf

🚀 Dataset: https://paperswithcode.com/dataset/peta

@Machine_learn
Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment

🖥 GitHub: https://github.com/kaistmm/SSLalignment

📕 Paper: https://arxiv.org/abs/2407.13676v1

🚀 Dataset: https://paperswithcode.com/dataset/is3-interactive-synthetic-sound-source

@Machine_learn
🌟 MG-LLaVA - multimodal LLM with advanced capabilities for working with visual information

Researchers from Shanghai University recently released MG-LLaVA, a multimodal LLM (MLLM) that extends visual processing with additional components dedicated to low- and high-resolution inputs.

MG-LLaVA integrates an additional high-resolution visual encoder to capture fine details, which are then fused with the base visual features by a Conv-Gate network.
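
A rough sketch of what gated fusion of this kind can look like, in plain Python. This is not the MG-LLaVA code: the function name `gated_fusion`, the weight layout, and the `low + gate * high` formula are assumptions made for illustration. A 1x1 convolution over channels is written here as a per-token linear map.

```python
import math


def sigmoid(x):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(low, high, gate_weights, gate_bias):
    """Fuse two feature maps token by token (hypothetical helper).

    low, high:    lists of tokens, each token a list of C floats.
    gate_weights: C rows of 2*C floats; acts like a 1x1 convolution
                  applied to the concatenated [low, high] channels.
    gate_bias:    list of C floats.

    Assumed formula (not taken from the paper): fused = low + gate * high.
    """
    fused = []
    for low_tok, high_tok in zip(low, high):
        concat = low_tok + high_tok  # channel-wise concatenation
        out = []
        for c in range(len(low_tok)):
            z = sum(w * x for w, x in zip(gate_weights[c], concat)) + gate_bias[c]
            gate = sigmoid(z)  # how much high-res detail to let through
            out.append(low_tok[c] + gate * high_tok[c])
        fused.append(out)
    return fused


if __name__ == "__main__":
    low = [[1.0, 2.0]]    # one token, two channels (low-resolution branch)
    high = [[4.0, 6.0]]   # matching token from the high-resolution branch
    zero_w = [[0.0] * 4, [0.0] * 4]
    print(gated_fusion(low, high, zero_w, [0.0, 0.0]))
```

With all-zero weights and bias the gate sits at 0.5, so half of the high-resolution signal is added to each channel. In a trained model the gate would be learned, letting the network decide per channel how much fine detail to mix in.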

Trained exclusively on publicly available multimodal data, MG-LLaVA achieves excellent results.

🟡 MG-LLaVA page
🖥 GitHub

@Machine_learn
🖥 StackFLOW: Monocular Human-Object Reconstruction by Stacked Normalizing Flow with Offset

🖥 GitHub: https://github.com/huochf/StackFLOW

📕 Paper: https://arxiv.org/abs/2407.20545v1

🚀 Dataset: https://paperswithcode.com/dataset/behave

@Machine_learn
How to Think Like a Computer Scientist: Interactive Edition

https://runestone.academy/ns/books/published/thinkcspy/index.html

@Machine_learn
No learning rates needed: Introducing SALSA - Stable Armijo Line Search Adaptation

🖥 GitHub: https://github.com/themody/no-learning-rates-needed-introducing-salsa-stable-armijo-line-search-adaptation

📕 Paper: https://arxiv.org/abs/2407.20650v1

🚀 Dataset: https://paperswithcode.com/dataset/cifar-10

✅ @Machine_learn
💨 Scaling hierarchical agglomerative clustering to trillion-edge graphs

https://research.google/blog/scaling-hierarchical-agglomerative-clustering-to-trillion-edge-graphs/

✅ @Machine_learn