CoSTI: Consistency Models for (a faster) Spatio-Temporal Imputation
31 Jan 2025 · Javier Solís-García, Belén Vega-Márquez, Juan A. Nepomuceno, Isabel A. Nepomuceno-Chamorro ·
Multivariate Time Series Imputation (MTSI) is crucial for many applications, such as healthcare monitoring and traffic management, where incomplete data can compromise decision-making. Existing state-of-the-art methods, like Denoising Diffusion Probabilistic Models (DDPMs), achieve high imputation accuracy; however, they suffer from significant computational costs and are notably time-consuming due to their iterative nature. In this work, we propose CoSTI, an innovative adaptation of Consistency Models (CMs) for the MTSI domain. CoSTI employs Consistency Training to achieve comparable imputation quality to DDPMs while drastically reducing inference times, making it more suitable for real-time applications. We evaluate CoSTI across multiple datasets and missing data scenarios, demonstrating up to a 98% reduction in imputation time with performance on par with diffusion-based models. This work bridges the gap between efficiency and accuracy in generative imputation tasks, providing a scalable solution for handling missing data in critical spatio-temporal systems.
Paper: https://arxiv.org/pdf/2501.19364v1.pdf
Code: https://github.com/javiersgjavi/costi
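The speedup claimed above comes from replacing the iterative denoising loop of a DDPM with a single consistency-function evaluation. The toy numpy sketch below illustrates only that structural difference (one network call vs. T calls) with stand-in functions; it is not CoSTI's actual model, and `denoise`/`consistency` are hypothetical placeholders for trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def ddpm_impute(x_obs, mask, denoise_fn, T=50):
    """Diffusion-style imputation: T iterative denoising passes (toy sketch)."""
    x = np.where(mask, x_obs, rng.standard_normal(x_obs.shape))
    steps = 0
    for t in range(T, 0, -1):
        x = denoise_fn(x, t)
        x = np.where(mask, x_obs, x)  # keep observed entries fixed
        steps += 1
    return x, steps

def consistency_impute(x_obs, mask, consistency_fn, T=50):
    """Consistency-style imputation: a single call maps noise to data."""
    x = np.where(mask, x_obs, rng.standard_normal(x_obs.shape))
    x = consistency_fn(x, T)          # one network evaluation instead of T
    return np.where(mask, x_obs, x), 1

# Toy "networks" that shrink values toward zero (stand-ins for trained models).
denoise = lambda x, t: x * (1 - 1.0 / (t + 1))
consistency = lambda x, t: x * 0.1

x_obs = rng.standard_normal((4, 3))   # spatio-temporal block (toy size)
mask = rng.random((4, 3)) > 0.2       # True = observed, False = missing

x_ddpm, ddpm_steps = ddpm_impute(x_obs, mask, denoise)
x_cm, cm_steps = consistency_impute(x_obs, mask, consistency)
print(ddpm_steps, cm_steps)  # 50 vs 1 network evaluations
```

With 50 denoising steps, cutting inference to one call is where a ~98% reduction in imputation time can plausibly come from, at equal per-call cost.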
@Machine_learn
Practical Statistics for Data Scientists.pdf
16 MB
Practical Statistics for Data Scientists
50+ Essential Concepts Using R and Python
#Python #Book
@Machine_learn
Hello everyone,
We are offering a 50% discount on our two packages, machine learning and deep learning, which together include 36 hands-on projects in image processing and text processing. Anyone who needs these two packages can message me directly. One month of consulting on these projects is also included.
@Raminmousa
Microsoft just updated their blog with 300 examples of real-world AI use cases.
📕 Article
@Machine_learn
Forwarded from Github LLMs
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
13 Dec 2024 · Zhiyu Wu, Xiaokang Chen, Zizheng Pan, Xingchao Liu, Wen Liu, Damai Dai, Huazuo Gao, Yiyang Ma, Chengyue Wu, Bingxuan Wang, Zhenda Xie, Yu Wu, Kai Hu, Jiawei Wang, Yaofeng Sun, Yukun Li, Yishi Piao, Kang Guan, Aixin Liu, Xin Xie, Yuxiang You, Kai Dong, Xingkai Yu, Haowei Zhang, Liang Zhao, Yisong Wang, Chong Ruan ·
We present DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL, through two major upgrades. For the vision component, we incorporate a dynamic tiling vision encoding strategy designed for processing high-resolution images with different aspect ratios. For the language component, we leverage DeepSeekMoE models with the Multi-head Latent Attention mechanism, which compresses the Key-Value cache into latent vectors, to enable efficient inference and high throughput. Trained on an improved vision-language dataset, DeepSeek-VL2 demonstrates superior capabilities across various tasks, including but not limited to visual question answering, optical character recognition, document/table/chart understanding, and visual grounding. Our model series is composed of three variants: DeepSeek-VL2-Tiny, DeepSeek-VL2-Small and DeepSeek-VL2, with 1.0B, 2.8B and 4.5B activated parameters respectively. DeepSeek-VL2 achieves competitive or state-of-the-art performance with similar or fewer activated parameters compared to existing open-source dense and MoE-based models. Codes and pre-trained models are publicly accessible at https://github.com/deepseek-ai/DeepSeek-VL2.
Paper: https://arxiv.org/pdf/2412.10302v1.pdf
Code: https://github.com/deepseek-ai/deepseek-vl2
Datasets: RefCOCO TextVQA MMBench
DocVQA
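The Multi-head Latent Attention idea mentioned in the abstract — caching a small latent instead of full keys and values — can be sketched in a few lines of numpy. This is a simplified illustration with random stand-in weights (`W_down`, `W_up_k`, `W_up_v` are hypothetical learned projections), not DeepSeek's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_latent, seq = 64, 8, 16

# Down- and up-projection matrices (random stand-ins for learned weights).
W_down = rng.standard_normal((d_model, d_latent)) / np.sqrt(d_model)
W_up_k = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)
W_up_v = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)

h = rng.standard_normal((seq, d_model))   # per-token hidden states

# Cache only the compressed latent: seq x d_latent floats,
# instead of 2 x seq x d_model for separate K and V caches.
latent_cache = h @ W_down

# Keys/values are reconstructed from the latent on the fly at attention time.
K = latent_cache @ W_up_k
V = latent_cache @ W_up_v

full_cache_floats = 2 * seq * d_model
latent_cache_floats = latent_cache.size
print(latent_cache_floats, full_cache_floats)  # 128 vs 2048 floats cached
```

Here the cache shrinks 16x; the price is two extra matrix multiplies per attention call, which is the throughput trade the abstract alludes to.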
💠
https://www.tg-me.com/deep_learning_proj
Forwarded from Papers
Hello everyone,
We are looking for a third co-author for the paper below, which we have been working on for 8 months.
Title: Gaussian Mixture Latent for Recurrent Neural Networks' Basic Deficiencies
Time series prediction analyzes patterns in past data to forecast the future. Traditional machine learning algorithms, despite achieving impressive results, require manual feature selection. Automatic feature selection, combined with the notion of time built into deep recurrent networks, has led to more suitable solutions. However, because of back-propagation, the order in which features are presented to a deep recurrent network changes the results, and selecting the feature order is an NP-complete problem. This research aims to provide a solution that mitigates this problem.
The contribution fee for the third author is $1,000, and the target journal is Expert Systems.
@Raminmousa
@Machine_learn
CycleGuardian: A Framework for Automatic Respiratory Sound Classification Based on Improved Deep Clustering and Contrastive Learning
🖥 Github: https://github.com/chumingqian/CycleGuardian
📕 Paper: https://arxiv.org/abs/2502.00734v1
🌟 Dataset: https://paperswithcode.com/dataset/icbhi-respiratory-sound-database
@Machine_learn
Forwarded from Papers
Hello everyone. Many of you who want to start a paper need a roadmap to get going, so we are offering consulting collaborations on the topics below. Topic selection, idea selection, review of the paper's overall structure, and journal selection will be handled by me, and we will hold one session per week to review the work done. The consulting fee for each topic is 5 toman.
💠 Medical Image
1- Alzheimer's disease classification
2- Wound image classification
3- Skin cancer classification
4- Breast cancer segmentation
💠 Time series
1- Crypto market price prediction: high-dimensional features
2- Crypto market illiquidity prediction: high-dimensional features
3- Air quality prediction
4- Network traffic prediction
5- Malware detection
💠 Text mining
1- Large language models: a systematic survey
2- Multi-domain sentiment analysis
3- Extracting psychiatric stressors for suicide from Twitter
To participate, you can contact me via my ID.
@Raminmousa
Detecting Backdoor Samples in Contrastive Language Image Pretraining
3 Feb 2025 · Hanxun Huang, Sarah Erfani, Yige Li, Xingjun Ma, James Bailey ·
Contrastive language-image pretraining (CLIP) has been found to be vulnerable to poisoning backdoor attacks where the adversary can achieve an almost perfect attack success rate on CLIP models by poisoning only 0.01% of the training dataset. This raises security concerns on the current practice of pretraining large-scale models on unscrutinized web data using CLIP. In this work, we analyze the representations of backdoor-poisoned samples learned by CLIP models and find that they exhibit unique characteristics in their local subspace, i.e., their local neighborhoods are far sparser than those of clean samples. Based on this finding, we conduct a systematic study on detecting CLIP backdoor attacks and show that these attacks can be easily and efficiently detected by traditional density ratio-based local outlier detectors, whereas existing backdoor sample detection methods fail. Our experiments also reveal that an unintentional backdoor already exists in the original CC3M dataset and has been trained into a popular open-source model released by OpenCLIP. Based on our detector, one can clean up a million-scale web dataset (e.g., CC3M) efficiently within 15 minutes using 4 Nvidia A100 GPUs.
Paper: https://arxiv.org/pdf/2502.01385v1.pdf
Code: https://github.com/HanxunH/Detect-CLIP-Backdoor-Samples
Datasets: Conceptual Captions CC12M RedCaps
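The detection signal described in the abstract — poisoned samples sitting in unusually sparse local neighborhoods — can be demonstrated with a generic density-ratio outlier score over embeddings. The sketch below is a simplified score of my own construction (kNN distance divided by the mean kNN distance of the neighbors), not the paper's exact detector, applied to synthetic "embeddings".

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_dist(X, k):
    """Distance to the k-th nearest neighbor, plus neighbor indices."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)          # exclude self-matches
    idx = np.argsort(D, axis=1)[:, :k]   # k nearest neighbors per point
    kd = np.take_along_axis(D, idx, axis=1)[:, -1]
    return kd, idx

def density_ratio_scores(X, k=5):
    """Ratio > 1 means the point's neighborhood is sparser than its neighbors'."""
    kd, idx = knn_dist(X, k)
    neighbor_kd = kd[idx].mean(axis=1)
    return kd / (neighbor_kd + 1e-12)

# Dense "clean" cluster plus a few isolated "poisoned" embeddings.
clean = rng.standard_normal((100, 16)) * 0.5
poison = rng.standard_normal((3, 16)) * 0.5 + 6.0
X = np.vstack([clean, poison])

scores = density_ratio_scores(X)
flagged = np.argsort(scores)[-3:]        # three sparsest neighborhoods
print(sorted(flagged.tolist()))
```

On this toy data the three appended poison points (indices 100-102) get the highest scores, since their k-th-neighbor distances dwarf those of their clean neighbors — the same sparsity asymmetry the paper exploits at million-sample scale.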
@Machine_learn
Efficient Reasoning with Hidden Thinking
Chain-of-Thought (CoT) reasoning has become a powerful framework for improving complex problem-solving capabilities in Multimodal Large Language Models (MLLMs). However, the verbose nature of textual reasoning introduces significant inefficiencies. In this work, we propose Heima (as hidden llama), an efficient reasoning framework that leverages reasoning CoTs in hidden latent space. We design the Heima Encoder to condense each intermediate CoT into a compact, higher-level hidden representation using a single thinking token, effectively minimizing verbosity and reducing the overall number of tokens required during the reasoning process. Meanwhile, we design the corresponding Heima Decoder with traditional Large Language Models (LLMs) to adaptively interpret the hidden representations into variable-length textual sequences, reconstructing reasoning processes that closely resemble the original CoTs. Experimental results across diverse reasoning MLLM benchmarks demonstrate that the Heima model achieves higher generation efficiency while maintaining or even improving zero-shot task accuracy. Moreover, the effective reconstruction of multimodal reasoning processes with the Heima Decoder validates both the robustness and interpretability of our approach.
Paper: https://arxiv.org/pdf/2501.19201v1.pdf
Code: https://github.com/shawnricecake/heima
Datasets: MMBench - MM-Vet - MathVista - MMStar - HallusionBench
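The encode-to-one-token, decode-to-many-tokens shape of the pipeline can be illustrated with a tiny numpy mock-up. Everything here (`W_enc`, the learned `queries`, the dimensions) is a hypothetical stand-in — the real Heima Encoder/Decoder are LLM components, not these projections — but the token-count flow (many CoT tokens → one thinking token → variable-length reconstruction) matches the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32            # hidden size (toy)
cot_len = 12      # length of the verbose textual CoT

cot_embeddings = rng.standard_normal((cot_len, d))     # verbose CoT tokens

# "Heima Encoder" stand-in: condense the whole CoT into ONE thinking token.
W_enc = rng.standard_normal((d, d)) / np.sqrt(d)
thinking_token = cot_embeddings.mean(axis=0) @ W_enc   # shape (d,)

# "Heima Decoder" stand-in: learned queries attend to the single token to
# unfold it back into a variable-length rationale (here: 5 output slots).
queries = rng.standard_normal((5, d))
attn = queries @ thinking_token                        # (5,) relevance weights
decoded = np.outer(np.tanh(attn), thinking_token)      # (5, d) reconstruction

print(cot_embeddings.shape[0], 1, decoded.shape[0])    # 12 tokens -> 1 -> 5
```

The efficiency claim follows from the middle step: during reasoning the model carries 1 token instead of 12, and decoding back to text is only needed when a human-readable rationale is requested.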
@Machine_learn