Awesome Public Datasets for Your Projects

This repository contains numerous datasets on topics such as:
Agriculture
Biology
Climate+Weather
Complex Networks
Computer Networks
Cyber Security
Data Challenges
Earth Science
Economics
Education
Energy
Entertainment
Finance
...
There's a lot you can lay your hands on here.

Stars⭐️: 48.8K
Fork: 8.7K
Repo: https://github.com/awesomedata/awesome-public-datasets

Join @datascience_bds for more cool data science materials.
*This channel belongs to @bigdataspecialist group
Machine Learning For Dummies
IBM Limited Edition
Authors: Judith Hurwitz, Daniel Kirsch

https://www.ibm.com/downloads/cas/GB8ZMQZ3
Let's talk about some simple statistics terms: mean, median, and mode

Mean, median, and mode are three kinds of "averages". There are many "averages" in statistics, but these are, I think, the three most common, and are certainly the three you are most likely to encounter in your pre-statistics courses, if the topic comes up at all.

The "mean" is the "average" you're used to, where you add up all the numbers and then divide by the number of numbers.
The "median" is the "middle" value in the list of numbers. To find the median, your numbers have to be listed in numerical order from smallest to largest, so you may have to rewrite your list before you can find the median.
The "mode" is the value that occurs most often. If no number in the list is repeated, then there is no mode for the list.

Task:
Find the mean, median, mode, and range for the following list of values:
13, 18, 13, 14, 13, 16, 14, 21, 13

Solution:
mean: 15
median: 14
mode: 13
range: 8

Explanation:
The mean is the usual average, so I'll add and then divide:
(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15

The median is the middle value, so first I'll have to rewrite the list in numerical order:
13, 13, 13, 13, 14, 14, 16, 18, 21
There are nine numbers in the list, so the middle one will be the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number: 14

The mode is the number that is repeated more often than any other, so 13 is the mode.

The range is the difference between the largest and smallest values: 21 - 13 = 8
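
If you want to check the arithmetic yourself, here is a minimal sketch using Python's built-in statistics module. The variable names are just illustrative, and multimode needs Python 3.8 or newer.

```python
from statistics import mean, median, multimode

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

mean_value = mean(values)                 # sum of the values divided by how many there are
median_value = median(values)             # middle value once the list is sorted
modes = multimode(values)                 # list of the most frequent value(s)
value_range = max(values) - min(values)   # largest value minus smallest value

print("mean:", mean_value)
print("median:", median_value)
print("mode(s):", modes)
print("range:", value_range)
```

multimode returns a list, so it also copes with lists that have more than one most-frequent value.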
UDEMY FREE DATA MANIPULATION AND DEEP LEARNING COURSES

1) Data Manipulation in Python: Master Python, Numpy & Pandas

Rating ⭐️: 4.3 out of 5
Students 👨‍🏫: 80,451
Created by: Meta Brains

🔗 Course link


2) Python for Deep Learning: Build Neural Networks in Python

Rating ⭐️: 4.2 out of 5
Students 👨‍🏫: 44,128
Created by: Meta Brains

🔗 Course link

Note: Free coupon is inserted in URL. Courses are FREE FOR 3 DAYS

#python #datanalysis #datascience #deeplearing #numpy #pandas

How to choose a chart for data visualization?
Data Preprocessing: Understanding and Detecting Outliers

Here's a guide to understanding, detecting and handling outliers👀.
I hope you gain the confidence you need to handle them😁

Outlier Detection and Analysis Methods
Link: Click Me 😌

Detecting and Treating Outliers | Treating the odd one out!
Link: Click Me 😌

Python Treatment for Outliers in Data Science
Link: Click Me 😌

Why You Shouldn’t Just Delete Outliers
Link: Click Me😌


Forwarded from Cool GitHub repositories
labmlai/annotated_deep_learning_paper_implementations
This is a collection of simple PyTorch implementations of neural networks and related algorithms. These implementations are documented with explanations.

Creator: labml.ai
Stars ⭐️: 7.8k
Forked By: 703
GithubRepo: https://github.com/labmlai/annotated_deep_learning_paper_implementations

Join @github_repositories_bds for more cool repositories.
*This channel belongs to @bigdataspecialist group
A LITTLE GUIDE TO HANDLING MISSING DATA
Is any feature missing more than 5-10% of its values? Then you should treat it as a feature with a high absence rate 👀

How can you handle these missing values and make sure you don't lose an important part of your data? 🤷‍♀️
Not a problem 😌. Here are important facts you must know 😉

✍️ Instances with missing values for all features should be eliminated
✍️ Features with a high absence rate should either be eliminated or filled with values
✍️ Missing values can be replaced using Mean Imputation or Regression Imputation
✍️ Be careful with Mean Imputation: it may introduce bias because it evens out all instances
✍️ Regression Imputation might overfit your model
✍️ Mean and Regression Imputation can't be applied to text features with missing values
✍️ Text features with missing values can be eliminated if they are not needed in the data
✍️ Important text features with missing values can be filled with a new class or category labelled "uncategorized"
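
A few of the points above can be sketched with pandas. This is only a rough illustration under assumptions: the toy DataFrame and its column names (age, income, segment) are made up, and mean imputation is shown purely as one option, with the bias caveat above still applying.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, np.nan, 40.0],
    "income": [50.0, 60.0, np.nan, np.nan, 80.0],
    "segment": ["a", None, "b", None, "a"],
})

# Eliminate instances (rows) whose values are missing for ALL features.
df = df.dropna(how="all").reset_index(drop=True)

# Share of missing values per feature, to spot a high absence rate.
missing_share = df.isna().mean()

# Mean imputation for the numeric features only.
numeric_cols = ["age", "income"]
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

# Text feature: fill the gaps with a new "uncategorized" category.
df["segment"] = df["segment"].fillna("uncategorized")

print(missing_share)
print(df)
```

Regression imputation is left out here to keep the sketch short.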
Important Methods in Pandas
UDEMY FREE DEEP LEARNING COURSE

Python for Deep Learning: Build Neural Networks in Python

Rating ⭐️: 4.2 out of 5
Students 👨‍🏫: 44,894
Created by: Meta Brains

🔗 Course link

Note: Free coupon is inserted in URL. Courses are FREE FOR FIRST 1000 enrollments

#python #datanalysis #datascience #deeplearing

Artificial Neural Networks (ANN) with Keras in Python and R

Rating ⭐️: 4.7 out of 5
Duration: 11 hours of on-demand video
Students 👨‍🏫: 143,495
Created by: Start-Tech Academy

🔗 Course link

Note: Free coupon is inserted in URL. Courses are FREE FOR FIRST 1000 enrollments

#ai #ml #neural_networks #machine_learning #data_science #deep_learning

microsoft/Data-Science-For-Beginners
Azure Cloud Advocates at Microsoft are pleased to offer a 10-week, 20-lesson curriculum all about Data Science. Each lesson includes pre-lesson and post-lesson quizzes, written instructions to complete the lesson, a solution, and an assignment. Our project-based pedagogy allows you to learn while building, a proven way for new skills to 'stick'.

Creator: Microsoft
Stars ⭐️: 11.1k
Forked By: 1.9k
GithubRepo: https://github.com/microsoft/Data-Science-For-Beginners

Understanding the Three Regression Types
Hyatt_Saleh_The_Machine_Learning_Workshop_Second_Edition_Get_ready.pdf
6.3 MB
The Machine Learning Workshop

Get ready to develop your own high-performance machine learning algorithms with scikit-learn

Author: Hyatt Saleh
Pages: 285
Pandas_Cheat_Sheet.pdf
387.2 KB
THE PANDAS CHEAT SHEET
A detailed guide to data wrangling using pandas
Reasons Why Data Goes Missing
Understanding why data is missing from your dataset is important because it helps you determine the type of missing data and what you need to do about it. Let's get our brains to grasp this concept, shall we? 😁😁
Missing Completely at Random (MCAR): Whether a value is missing has nothing to do with its hypothetical value or with the values of other variables. E.g.:
You collect data on end-of-year holiday spending patterns. You survey adults on how much they spend annually on gifts for family and friends in dollar amounts.
You note that there are a few missing values in your holiday spending dataset. Some people started answering your survey but dropped out or skipped a question.
However, you note that you have data points from a wide distribution, ranging from low to high values.
Therefore, you conclude that the missing values aren’t related to any specific holiday spending amount range.

Missing at Random (MAR): The propensity for a data point to be missing is unrelated to the missing value itself but related to some observed data. E.g.:
You repeat your data collection with a new group. You notice that there are more missing values for adults aged 18–25 than for other age groups.
But looking at the observed data for adults aged 18–25, you notice that the values are widely spread. It’s unlikely that the missing data are missing because of the specific values themselves.
Instead, some younger adults may be less inclined to reveal their holiday spending amounts for unrelated reasons (e.g., more protective of their privacy).

Missing Not at Random (MNAR): This is data that is neither MAR nor MCAR, i.e. the value of the variable that's missing is related to the reason it's missing. E.g.:
If some participants with low incomes avoid reporting their holiday spending amounts because those amounts are low, then the missing values in your dataset are an MNAR problem.
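
To make the three mechanisms concrete, here is a small simulation sketch using only Python's standard library. Everything in it is an assumption for illustration: the made-up survey of 1000 respondents, the missingness probabilities, and the helper names mask_spend and observed_mean.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical survey: each respondent has an age and a holiday spending amount.
people = [
    {"age": random.randint(18, 70), "spend": random.randint(50, 2000)}
    for _ in range(1000)
]

def mask_spend(rows, p_missing):
    """Return the spend values, replaced by None with a per-row probability."""
    return [
        None if random.random() < p_missing(row) else row["spend"]
        for row in rows
    ]

# MCAR: every answer has the same chance of going missing.
mcar = mask_spend(people, lambda r: 0.2)

# MAR: missingness depends on an observed variable (age), not on spend itself.
mar = mask_spend(people, lambda r: 0.5 if r["age"] <= 25 else 0.1)

# MNAR: missingness depends on the missing value itself (low spenders skip it).
mnar = mask_spend(people, lambda r: 0.6 if r["spend"] < 500 else 0.05)

def observed_mean(values):
    """Mean of the values that were actually reported."""
    seen = [v for v in values if v is not None]
    return sum(seen) / len(seen)

true_mean = sum(r["spend"] for r in people) / len(people)
print("true mean:", round(true_mean, 1))
print("observed mean under MCAR:", round(observed_mean(mcar), 1))
print("observed mean under MAR:", round(observed_mean(mar), 1))
print("observed mean under MNAR:", round(observed_mean(mnar), 1))
```

In this setup the MNAR column should overstate average spending, because low spenders are the ones most likely to drop out.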