dakshasingh/data_science

data_science

Data Scientists/Big Data/Analytics/R/SAS/Python

A data scientist is a person employed to analyse and interpret complex digital data, applying statistics to assist a business in its decision making. Data scientists are data wranglers who take large amounts of messy data (which may be unstructured, semi-structured, or structured) and use their skills in maths, statistics, and programming to clean and organise the data in order to make it fit for analysis.

In addition to possessing the "hard skills" of being quantitative and technically focused, data scientists also have versatile communication and collaboration skills and an innate curiosity for exploring and experimenting with data. They also tend to be skeptical people, in that they are likely to ask a lot of questions about the viability of a given solution and whether it will really work. These behavioral traits are what set apart someone who can work with others to use data to drive change.

MACHINE LEARNING

Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed. ML and AI are distinct but connected fields. A widely cited definition, due to Tom Mitchell, states: “A computer program is said to learn from experience ‘E’ with respect to some class of tasks ‘T’ and performance measure ‘P’ if its performance at tasks in ‘T’, as measured by ‘P’, improves with experience ‘E’.” Machine learning is closely related to (and often overlaps with) computational statistics, which also focuses on prediction-making through the use of computers. It has strong ties to mathematical optimization, which delivers methods, theory and application domains to the field. Machine learning is commonly divided into two broad categories:

  1. Supervised Learning – The task of learning a function from labeled data, where labeled data is a dataset that contains both independent variables and a dependent (target) variable. Examples: Classification, Regression

  2. Unsupervised Learning – The task of exploring data to derive inferences or insights from the dataset. Here the dependent (target) variable is unknown. Examples: Dimension Reduction Techniques (PCA, Factor Analysis), Clustering, Association Analysis

Machine learning is sometimes conflated with data mining; the latter focuses more on exploratory data analysis, which is associated with unsupervised learning. Unsupervised machine learning can also be used to learn baseline behavioral profiles for various entities, which are then used to find meaningful anomalies. Within the field of data analytics, machine learning is a method used to devise complex models and algorithms that lend themselves to prediction; in commercial use, this is known as predictive analytics. These analytical models allow researchers, data scientists, engineers, and analysts to "produce reliable, repeatable decisions and results" and uncover "hidden insights" through learning from historical relationships and trends in the data.
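The difference between the two categories above can be made concrete with a minimal sketch in plain Python. Everything here is illustrative rather than part of this repository: the function names `fit_line` and `kmeans_1d`, the toy data, and the choice of least-squares regression and k-means clustering as representatives of each category are all assumptions made for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised: learn y = slope * x + intercept from labeled (x, y) pairs.

    Uses the closed-form ordinary least-squares solution for one feature.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def kmeans_1d(points, k=2, iterations=20):
    """Unsupervised: group unlabeled numbers into k clusters (Lloyd's algorithm).

    Returns the cluster centers in ascending order. No target variable is used.
    """
    ordered = sorted(points)
    # Deterministic initialisation: spread the starting centers across the range.
    step = (len(ordered) - 1) / max(k - 1, 1)
    centers = [ordered[round(i * step)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center.
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return sorted(centers)


# Hypothetical labeled data: hours studied (independent) and exam score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]
slope, intercept = fit_line(hours, scores)
print("supervised fit:", slope, intercept)

# Hypothetical unlabeled data: only the values themselves are known.
readings = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
print("unsupervised centers:", kmeans_1d(readings, k=2))
```

In the supervised half the scores act as labels that guide the fit, whereas the clustering half receives only raw readings and has to discover the two groups on its own. In practice a library such as scikit-learn would typically be used instead of hand-written routines like these.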
