MedMNIST-C: a benchmark dataset based on the MedMNIST+ collection, covering 12 2D datasets and 9 imaging modalities.
Github: https://github.com/francescodisalvo05/medmnistc-api
Paper: https://arxiv.org/abs/2406.17536v2
Dataset: https://paperswithcode.com/dataset/imagenet-c
@Machine_learn
pip install medmnistc
MARS 5 TTS
Github: https://github.com/Camb-ai/MARS5-TTS
Demo: https://www.camb.ai/
HF: https://huggingface.co/CAMB-AI/MARS5-TTS
Colab: https://colab.research.google.com/github/Camb-ai/mars5-tts/blob/master/mars5_demo.ipynb
@Machine_learn
SEE-2-SOUND: a method for generating complex spatial sound from images and videos
pip install see2sound
GitHub
Hugging Face
Arxiv
@Machine_learn
Seq2Seq: Sequence-to-Sequence Generator
Github: https://github.com/fiy2w/mri_seq2seq
Paper: https://arxiv.org/abs/2407.02911v1
Dataset: https://paperswithcode.com/task/contrastive-learning
@Machine_learn
Hello! Friends who have a paper and would like to submit it to this journal, message me — I can be put forward as a reviewer.
@Machine_learn
Minutes to Seconds: Speeded-up DDPM-based Image Inpainting with Coarse-to-Fine Sampling
Github: https://github.com/linghuyuhangyuan/m2s
Paper: https://arxiv.org/abs/2407.05875v1
Dataset: https://paperswithcode.com/task/denoising
@Machine_learn
LongVA: Long Context Transfer from Language to Vision
Github: https://github.com/EvolvingLMMs-Lab/LongVA
Paper: https://arxiv.org/abs/2406.16852
Project: https://lmms-lab.github.io/posts/longva/
Demo: https://longva-demo.lmms-lab.com/
@Machine_learn
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation (ECCV 2024)
Github: https://github.com/fanghaook/ovformer
Paper: https://arxiv.org/abs/2407.07427v1
@Machine_learn
Multimodal contrastive learning for spatial gene expression prediction using histology images
Github: https://github.com/modelscope/data-juicer
Paper: https://arxiv.org/abs/2407.08583v1
Dataset: https://paperswithcode.com/dataset/coco
@Machine_learn
An Empirical Study of Mamba-based Pedestrian Attribute Recognition
Github: https://github.com/event-ahu/openpar
Paper: https://arxiv.org/pdf/2407.10374v1.pdf
Dataset: https://paperswithcode.com/dataset/peta
@Machine_learn
Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment
Github: https://github.com/kaistmm/SSLalignment
Paper: https://arxiv.org/abs/2407.13676v1
Dataset: https://paperswithcode.com/dataset/is3-interactive-synthetic-sound-source
@Machine_learn
MG-LLaVA: a multimodal LLM with advanced capabilities for working with visual information
Researchers from Shanghai University recently released MG-LLaVA, an MLLM that extends visual processing with additional components dedicated to handling low- and high-resolution inputs.
MG-LLaVA integrates an additional high-resolution visual encoder to capture fine details, which are then combined with the base visual features through a Conv-Gate fusion network.
Trained exclusively on publicly available multimodal data, MG-LLaVA achieves excellent results.
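The gated fusion idea can be sketched as a simple per-element mix. The toy NumPy version below is only an illustration of the mechanism: the function and the choice of computing the gate from the low-resolution features are assumptions for the sketch, not the paper's actual learned Conv-Gate network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(low_res, high_res):
    # Per-element gate in (0, 1); here derived from the low-res features,
    # whereas the real model computes it with a learned convolutional gate.
    gate = sigmoid(low_res)
    # Where the gate is high, keep more high-resolution detail.
    return gate * high_res + (1.0 - gate) * low_res

low = np.zeros((2, 4))    # sigmoid(0) = 0.5, so the mix is even
high = np.ones((2, 4))
fused = gated_fusion(low, high)
print(fused)              # every element is 0.5
```

In the actual architecture the two feature maps come from separate visual encoders and the gate is produced by convolutions over them; the sketch keeps only the gating arithmetic.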
MG-LLaVA page
GitHub
@Machine_learn
Dataset: https://paperswithcode.com/dataset/behave
@Machine_learn
EMO-Disentanger
Github: https://github.com/yuer867/emo-disentanger
Paper: https://arxiv.org/abs/2407.20955v1
Dataset: https://paperswithcode.com/dataset/emopia
@Machine_learn
How to Think Like a Computer Scientist: Interactive Edition
https://runestone.academy/ns/books/published/thinkcspy/index.html
@Machine_learn