Scott Davis Transcript

Open Source Data Science Masters


I'm going to have some time for independent study this year, so I plan to work through as much of this curriculum as possible. I work in the real estate industry, where we have a great deal of data that isn't used for meaningful analysis, and the tools, though readily available, haven't caught up with the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow R/Python and QGIS. I'm a bit of an open-source nut, so I like learning much better this way. I'm looking for people to connect with and possibly to work with on projects.

Want to collaborate? Get in touch:

Open Source Curriculum

Base Introduction

Data Science Introductions

- [ ] Intro to Data Science by UW / Coursera, online course
- [ ] Data Science Specialization by Johns Hopkins / Coursera
  - [X] [Data Scientists Toolbox](https://www.coursera.org/account/accomplishments/certificate/UY4EBM46HL)
  - [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL)
  - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW)
  - [X] [Exploratory Data Analysis](https://www.coursera.org/account/accomplishments/records/2PPsRu2Us3sUehBQ)
  - [X] [Reproducible Research]
  - [ ] [Statistical Inference] (in progress)
  - [ ] [Regression Models] (in progress)
  - [X] [Practical Machine Learning]
  - [ ] [Developing Data Products]
  - [ ] [Data Science Capstone]
- [ ] [Data Science by Harvard](http://cs109.github.io/2015/) (online course)
- [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do)
- [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf)
- [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - in Excel, but also works in LibreOffice, and so much of business analytics is still in Excel.

Mathematics/Statistics

- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html)
- [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html)
- [ ] [Linear Algebra](http://www.amazon.com/Linear-Algebra-Dover-Books-Mathematics/dp/048663518X)
- [ ] Problem-Solving Heuristics: [How to Solve It](http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X)

Computing

R:

- [ ] [R in Action](https://www.manning.com/books/r-in-action-second-edition?a_bid=5c2b1e1d&a_aid=RiA2ed)
- [ ] [R Cookbook](http://shop.oreilly.com/product/9780596809164.do)
- [ ] [Forecasting: Principles and Practice](http://otexts.com/fpp/)

R Libraries/Task Views

Python:

QGIS:

MySQL:

Octave:

PostGIS/PostGRESQL:

Algorithms

- [ ] [Algorithms Design & Analysis](http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=IntroToAlgorithms) Stanford openclassroom

Distributed Computing Paradigms

- [ ] Intro to Hadoop and MapReduce by Cloudera and Udacity

*Note: I might swap the above course with an EdX course on Apache Spark and distributed computing*

Data Mining

- [ ] Mining Massive Data Sets, by Stanford and Coursera
- [ ] [Clean Data](https://www.packtpub.com/big-data-and-business-intelligence/clean-data)

Machine Learning/Predictive Analytics - Foundational/Theoretical/Practical

- [ ] Machine Learning, by Andrew Ng, Stanford and Coursera (NB: this class requires a lot of higher-level math)
- [ ] [An Introduction to Statistical Learning with Applications in R](http://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/) (by the authors of The Elements of Statistical Learning at Stanford)
- [ ] [Machine Learning with R](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition)
- [ ] [Building a Recommendation System in R](https://www.packtpub.com/big-data-and-business-intelligence/building-recommendation-system-r)
- [ ] [Mastering Predictive Analytics in R](https://www.packtpub.com/application-development/mastering-predictive-analytics-r)
- [ ] [Bootstrapping Machine Learning](http://www.louisdorard.com/machine-learning-book/)
- [ ] [Applied Predictive Modeling](http://www.amazon.com/gp/product/1461468485?psc=1&redirect=true&ref_=oh_aui_detailpage_o08_s00)

Analysis

- [ ] [Practical Data Science Cookbook](http://www.diveintopython.net/)
- [ ] [R Data Analysis Cookbook](code.google.com/edu/languages/google-python-class/)

Spatial Analysis

- [ ] [An Introduction to R for Spatial Analysis and Mapping](https://us.sagepub.com/en-us/nam/an-introduction-to-r-for-spatial-analysis-and-mapping/book241031)
- [ ] [Applied Spatial Data Analysis with R](http://www.springer.com/us/book/9781461476177)

Land Use/Transport/Gravity Modeling

- [ ] [Integrated Land Use and Transport Modelling: Decision Chains and Hierarchies](http://www.amazon.com/gp/product/0521022177?psc=1&redirect=true&ref_=oh_aui_detailpage_o03_s00)
- [ ] [Gravity and Spatial Interaction Models (Scientific Geography Series)](http://www.amazon.com/gp/product/0803925441?psc=1&redirect=true&ref_=oh_aui_detailpage_o06_s00)
- [ ] [TRANUS Model](http://www.tranus.com/tranus-english)
- [ ] [Urban Sim](https://pypi.python.org/pypi/urbansim)
- [ ] [Huff-tools Package in R](http://rstudio-pubs-static.s3.amazonaws.com/42357_1e6fcc5bcfec439096eb86a106ebf22e.html)

Data Design/Data Viz

- [ ] [Beautiful Evidence](http://www.edwardtufte.com/tufte/books_be)
- [ ] [Semiology of Graphics](http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/1589482611)
- [ ] [Visual Complexity Mapping Patterns of Information](http://www.visualcomplexity.com/vc/book/)
- [ ] [The Visual Display of Quantitative Information](http://www.edwardtufte.com/tufte/books_vdqi)
- [ ] [Design for Information](http://isabelmeirelles.com/book-design-for-information/)
- [ ] [Design Elements: A Graphical Style Manual](http://www.amazon.com/Design-Elements-Graphic-Style-Manual/dp/1592532616)
- [ ] [Storytelling with Data](http://www.amazon.com/gp/product/1119002257?psc=1&redirect=true&ref_=oh_aui_detailpage_o09_s00)
- [ ] [Mastering Python Data Visualization](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization)
- [ ] [The Grammar of Graphics](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization)
- [ ] [R Graphics Cookbook](http://shop.oreilly.com/product/9780596809164.do)

Relevant prior studies

- [X] MS in Community and Regional Planning, UT-Austin
- [X] BA in Liberal Arts, concentration in geography, UT-Austin

Open Source Data Science Masters Capstone Project

I'm interested in using data science approaches to build better intelligence behind real estate decisions, specifically by evaluating population growth, transactions, and location decisions. I'd also like to evaluate statistical learning techniques for making better pricing decisions. Finally, I'd like to develop a model to optimize real estate portfolios.

If you'd like to pair up for the capstone, let me know.