Enhancing Semantics in Multimodal Chain of Thought via Soft Negative Sampling
🖥 Github: https://github.com/zgmin/snse-cot
📕 Paper: https://paperswithcode.com/dataset/scienceqa
@Machine_learn
💡 Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
▪Github: https://github.com/alpha-vllm/lumina-t2x
▪Paper: https://arxiv.org/abs/2405.05945
▪Demo: https://lumina.sylin.host/
@Machine_learn
Awesome-Text-to-Video-Generation
🖥 Github: https://github.com/soraw-ai/awesome-text-to-video-generation
📕 Paper: https://arxiv.org/abs/2405.10674v1
@Machine_learn
⚡️ Deblur-GS: 3D Gaussian Splatting from Camera Motion Blurred Images
▪Code: https://github.com/Chaphlagical/Deblur-GS
▪Paper: https://chaphlagical.icu/Deblur-GS/static/paper/Deblur_GS_author_version.pdf
▪Project: https://chaphlagical.icu/Deblur-GS/
@Machine_learn
Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI
🖥 Github: https://github.com/935963004/labram
📕 Paper: https://arxiv.org/abs/2405.18765v1
@Machine_learn
Images that Sound: Composing Images and Sounds on a Single Canvas
abs: https://arxiv.org/abs/2405.12221
project page: https://ificl.github.io/images-that-sound/
code: https://github.com/IFICL/images-that-sound
This paper introduces an inference-time procedure that generates images that are also spectrograms corresponding to the prompt. It uses a latent image diffusion model and a latent audio diffusion model that share the same latent space (Stable Diffusion v1.5 and Auffusion) and denoises a single latent with both.
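A rough sketch of what "denoising the same latent with both models" could look like is given here. It assumes the two noise estimates are combined by a weighted average, which the post itself does not spell out. The two denoisers are toy stand-ins rather than Stable Diffusion or Auffusion, and `joint_denoise_step`, `w_image`, and `step_size` are hypothetical names used only for illustration. The update rule is also a deliberate simplification of a real diffusion scheduler.

```python
from typing import Callable, List

Latent = List[float]
Denoiser = Callable[[Latent, int], Latent]


def toy_image_denoiser(z: Latent, t: int) -> Latent:
    # Hypothetical stand-in for an image diffusion model's noise estimate.
    return [0.1 * v for v in z]


def toy_audio_denoiser(z: Latent, t: int) -> Latent:
    # Hypothetical stand-in for an audio diffusion model's noise estimate.
    return [0.3 * v for v in z]


def joint_denoise_step(
    z: Latent,
    t: int,
    image_eps: Denoiser,
    audio_eps: Denoiser,
    w_image: float = 0.5,
    step_size: float = 1.0,
) -> Latent:
    """Run one simplified denoising step on a latent shared by both models."""
    eps_image = image_eps(z, t)
    eps_audio = audio_eps(z, t)
    # Assumption: blend the two noise estimates with a weighted average.
    combined = [
        w_image * ei + (1.0 - w_image) * ea
        for ei, ea in zip(eps_image, eps_audio)
    ]
    # Simplified update: subtract the blended noise estimate from the latent.
    return [v - step_size * e for v, e in zip(z, combined)]


latent = [1.0, -2.0, 0.5]
for t in reversed(range(3)):
    latent = joint_denoise_step(latent, t, toy_image_denoiser, toy_audio_denoiser)
```

Because both models read and update one latent, the decoded result has to satisfy the image prompt and the audio prompt at once. `w_image` controls how much each model pulls on it.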
@Machine_learn
🔥🔥🔥 YOLOv10: Real-Time End-to-End Object Detection
▪Paper: https://arxiv.org/pdf/2405.14458
▪Github: https://github.com/THU-MIG/yolov10/
▪Demo: https://huggingface.co/spaces/kadirnar/Yolov10
▪Colab: https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/train-yolov10-object-detection-on-custom-dataset.ipynb#scrollTo=SaKTSzSWnG7s
@machine_learn
Forwarded from Papers
Hello, we plan to submit this paper to Nature. Author positions 1 to 4 are open; if anyone needs one, we are at your service.
Title:
Detection of brain tumors from images using the UNet architecture, with a comparative analysis of transfer learning methods and CNNs.
——————————————————————--
Abstract:
Health is crucial for human life, especially brain health, which is vital for all executive functions. Diagnosing brain health issues is often done using magnetic resonance imaging (MRI) devices, which provide critical data for health decision-makers. Images from these devices serve as a significant source of big data for artificial intelligence applications. This big data facilitates high performance in image processing classification problems, a subfield of artificial intelligence. In this study, we aim to classify brain tumors such as glioma, meningioma, and pituitary tumors from brain MRI images using the UNet architecture. To compare the results and gain a better understanding, we also employed Convolutional Neural Networks (CNN) and CNN-based models like Inception-V3, EfficientNetB4, VGG19, along with transfer learning methods for classification tasks. The models were evaluated using F-score, recall, precision, and accuracy metrics. The best accuracy result was achieved with CNN-VGG16, reaching 97%. The same transfer learning model also showed an F-score of 96%, an Area Under the Curve (AUC) value of 98%, a recall value of 98%, and a precision value of 97%. The UNet architecture and CNN-based transfer learning models play a significant role in the early diagnosis and rapid treatment of brain tumors, which is vital for improving patient outcomes.
——————————————————————
Keywords:
Brain tumor detection, UNet, CNN, Transfer Learning.
——————————————————————
Journal:
Scientific Reports
@Raminmousa
@Machine_learn
@paper4money
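The abstract above reports F-score, recall, precision, and accuracy. As a reminder of how these four metrics relate, here is a minimal sketch that computes them from confusion-matrix counts for a single class. The function name and the example counts are hypothetical and are not taken from the paper. For a multi-class task such as glioma / meningioma / pituitary, one would typically compute these per class (one-vs-rest) and then average.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute precision, recall, F1, and accuracy from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```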
InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation
🖥 Github: https://github.com/jacobyhsi/InterpreTabNet
📕 Paper: https://arxiv.org/abs/2406.00426v1
@Machine_learn
Hello friends, if you are going to mine, at least mine NFTs so that you get something out of it. In my opinion, you should read up on the fundamentals of these coins before mining. The project below has been better than everything else you have sent me.
https://www.tg-me.com/SpinnerCoin_bot/app?startapp=r_280673
🎙 Real-time in-browser speech recognition
▪Code: https://github.com/xenova/transformers.js/tree/v3/examples/webgpu-whisper
▪Hf: https://huggingface.co/spaces/Xenova/realtime-whisper-webgpu
@Machine_learn
🚀 AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
🖥 Github: https://github.com/woooodyy/agentgym
📕 Paper: https://arxiv.org/abs/2406.04151v1
🔥Project: https://agentgym.github.io/
⚡️Model (AgentEvol-7B): https://huggingface.co/AgentGym/AgentEvol-7B
@Machine_learn
▪Github: https://github.com/IntelLabs/MMPano
▪Paper: https://arxiv.org/abs/2406.01843
▪Project: https://zhipengcai.github.io/MMPano/
▪Video: https://youtu.be/XDMNEzH4-Ec
@Machine_learn
https://www.tg-me.com/SpinnerCoin_bot/app?startapp=r_280673
The only project that I think matters these days is Spinner, which is about mining NFTs.
💎 +750 $SPN as a first-time gift
🚀 Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset
🖥 Github: https://github.com/liamlian0727/usis10k
📕 Paper: https://arxiv.org/abs/2406.06039v1
@Machine_learn
🔈 Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language
▪Paper: https://arxiv.org/abs/2406.05629
▪Website: https://mhamilton.net/denseav
▪Code: https://github.com/mhamilton723/DenseAV
▪Video: https://youtu.be/wrsxsKG-4eE
@Machine_learn
—
pip install semantic-kernel
@Machine_learn
🔥 Astrologers have announced a week of video generation models!
Following the hype around the Kling, Luma and Runway models, a new version of the open-source Open-Sora has been released.
Open-Sora 1.2 from HPC-AI Tech has been published on Hugging Face.
Key points:
The new 1.1B model is trained on 20M videos and generates videos up to 14 seconds long at 720p resolution.
▪Diffusion Model: https://huggingface.co/hpcai-tech/OpenSora-STDiT-v3
▪VAE model: https://huggingface.co/hpcai-tech/OpenSora-VAE-v1.2
▪Technical report: https://github.com/hpcaitech/Open-Sora/blob/main/docs/report_03.md
▪Demo: https://huggingface.co/spaces/hpcai-tech/open-sora
@Machine_learn