awesome-machine-learning
A curated list of awesome Machine Learning frameworks, libraries and software
Creator: Joseph Misiti
Stars ⭐️: 60.9k
Forked By: 14.3k
https://github.com/josephmisiti/awesome-machine-learning
#machine #learning
➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Join @datascience_bds for more cool repositories.
*This channel belongs to @bigdataspecialist group
Intro to Data for Data Science
Learn the basics of data and how data are used in data science.
Rating ⭐️: 4.4 out of 5
Students 👨🎓 : 14,652
Duration ⏰ : 1hr 1min of on-demand video
Created by 👨🏫: Matthew Renze
🔗 Course Link
#data_science #data
➖➖➖➖➖➖➖➖➖➖➖➖➖➖
👉Join @datascience_bds for more👈
preview-9781785888922_A29007928.pdf
3.3 MB
Principles of Data Science
by Sinan Ozdemir
machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
Creator: Jean de Dieu Nyandwi
Stars ⭐️: 4.3k
Forked By: 695
https://github.com/Nyandwi/machine_learning_complete
#machine #learning
➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Join @datascience_bds for more cool repositories.
*This channel belongs to @bigdataspecialist group
Introduction to Python Programming (for Data Analytics)
Learn the fundamentals of the Python programming language for data analytics. Practice and solution resources included.
Rating ⭐️: 4.3 out of 5
Students 👨🎓 : 10,887
Duration ⏰ : 1hr 48min of on-demand video
Created by 👨🏫: Valentine Mwangi
🔗 Course Link
#data_analytics #data #python #programming
➖➖➖➖➖➖➖➖➖➖➖➖➖➖
👉Join @datascience_bds for more👈
introducing-data-science-machine-learning-python.pdf
14.6 MB
Introducing Data Science
by Davy Cielen, Arno D. B. Meysman, and Mohamed Ali
Python Data Science Handbook
This repository contains the full text of the Python Data Science Handbook, in the form of (free!) Jupyter notebooks.
Creator: Jake Vanderplas
Stars ⭐️: 39k
Forks: 17.1k
Repo: https://github.com/jakevdp/PythonDataScienceHandbook
➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Join @github_repositories_bds for more cool data science materials.
*This channel belongs to @bigdataspecialist group
SQL for Data Analysis: Solving real-world problems with data
A simple & concise MySQL course (applicable to any SQL dialect), perfect for data analysis, data science, and business intelligence.
Rating ⭐️: 4.3 out of 5
Students 👨🎓 : 47,690
Duration ⏰ : 1hr 57min of on-demand video
Created by 👨🏫: Max SQL
🔗 Course Link
#data_analytics #data #SQL #programming
➖➖➖➖➖➖➖➖➖➖➖➖➖➖
👉Join @datascience_bds for more👈
A few years ago, I was learning about transformers and wrote down some notes for myself. I recently came across those notes and decided to share part of them here in case any of you find them useful.
Most famous transformers
1. BERT (Bidirectional Encoder Representations from Transformers): BERT is a pre-trained transformer model developed by Google. It has achieved state-of-the-art results in various NLP tasks, such as question answering, sentiment analysis, and text classification.
2. GPT (Generative Pre-trained Transformer): GPT is a series of transformer-based models developed by OpenAI. GPT-3, the most recent version when these notes were written, is a highly influential model known for its impressive language generation capabilities. It has been used in various creative applications, including text completion, language translation, and dialogue generation.
3. Transformer-XL: Transformer-XL is a transformer-based model developed by researchers at Google. It addresses the limitation of standard transformers by incorporating a recurrence mechanism to capture longer-term dependencies in the input sequence. It has been successful in tasks that require modeling long-range context, such as language modeling.
4. T5 (Text-to-Text Transfer Transformer): T5, developed by Google, is a versatile transformer model capable of performing a wide range of NLP tasks. It follows a "text-to-text" framework, where different tasks are cast as text generation problems. T5 has demonstrated strong performance across various benchmarks and has been widely adopted in the NLP community.
5. RoBERTa (Robustly Optimized BERT Pretraining Approach): RoBERTa is a variant of BERT developed by Facebook AI. It addresses some limitations of the original BERT model by tweaking the training setup and introducing additional data. RoBERTa has achieved improved performance on several NLP tasks, including text classification and named entity recognition.
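A quick way to get a hands-on feel for these models is the Hugging Face transformers library. Below is a minimal sketch (my own example, not from the original notes) that loads two standard Hub checkpoints: BERT for masked-token prediction, and GPT-2, the openly available GPT checkpoint, for text generation.
```python
# Minimal sketch: trying a BERT-style and a GPT-style model via the
# Hugging Face `transformers` pipeline API (standard Hub checkpoint names).
from transformers import pipeline

# BERT: predict the masked token in a sentence
unmasker = pipeline("fill-mask", model="bert-base-uncased")
print(unmasker("Paris is the [MASK] of France.")[0]["token_str"])  # -> "capital"

# GPT-2: autoregressive text generation
generator = pipeline("text-generation", model="gpt2")
print(generator("Transformers are", max_new_tokens=20)[0]["generated_text"])
```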
BERT vs RoBERTa vs DistilBERT vs ALBERT
BERT - created by Google in 2018. Used for question answering, summarization, and sequence classification. BERT-base stacks 12 encoder layers and serves as the baseline for the models below.
RoBERTa - created by Facebook in 2019. Essentially the same architecture as BERT, but it improves on BERT by carefully optimizing the training setup and hyperparameters: it is trained on more data, with a bigger vocabulary and longer sequences. It outperforms BERT.
DistilBERT - created by Hugging Face in October 2019. Roughly the same general architecture as BERT, but smaller, with only 6 encoder layers. DistilBERT has 40% fewer parameters than the original BERT-base model, runs 60% faster, and retains over 95% of BERT's performance.
ALBERT (A Lite BERT) - published around the same time as DistilBERT. It has 18x fewer parameters than BERT and trains 1.7x faster. Unlike DistilBERT, which trades away a small amount of performance for its size, ALBERT shows no such tradeoff. The difference comes from how the two models are built: DistilBERT is trained by knowledge distillation, with BERT acting as the teacher, while ALBERT is trained from scratch like BERT. Better yet, ALBERT outperforms all the previous models, including BERT, RoBERTa, DistilBERT, and XLNet.
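The size claims above are easy to check yourself. Here is a small sketch, assuming the standard Hugging Face Hub checkpoint names, that loads the base checkpoint of each model family and counts its parameters:
```python
# Count parameters for the base checkpoint of each model family.
from transformers import AutoModel

for name in ["bert-base-uncased", "roberta-base",
             "distilbert-base-uncased", "albert-base-v2"]:
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M parameters")
```
(Note that the headline 18x figure comes from comparing large configurations; the base checkpoints show a smaller, though still dramatic, gap.)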
Note: Training speed matters little to end users, since all of these are pre-trained transformer models. Still, in some cases you will need to fine-tune a model on your own dataset, and that is where speed matters. Smaller, faster models like DistilBERT and ALBERT are also advantageous when memory or computational power is limited.
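To make the fine-tuning point concrete, here is a minimal sketch of a fine-tuning loop for DistilBERT on a toy two-example sentiment task (the texts and labels are placeholders; in practice you would use your own labeled dataset and a proper DataLoader):
```python
# Minimal fine-tuning sketch: DistilBERT for binary sequence classification.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

texts = ["great movie", "terrible plot"]  # toy data; label 1 = positive
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for _ in range(3):  # a few gradient steps, just to show the loop
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```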
Introduction to Datascience [R20DS501].pdf
5.3 MB
Introduction to Data Science
[R20DS501]
DIGITAL NOTES
Beyond Jupyter Notebooks
Build your own data science platform with Docker & Python
Rating ⭐️: 4.7 out of 5
Students 👨🎓 : 5,018
Duration ⏰ : 1hr 26min of on-demand video
Created by 👨🏫: Joshua Görner
🔗 Course Link
#data_science #Jupyter #python #Docker
➖➖➖➖➖➖➖➖➖➖➖➖➖➖
👉Join @datascience_bds for more👈