Machine Learning The Hard Way
https://www.youtube.com/watch?v=xeAB10QgDW8
- time(working with data) > time(optimizing algorithms)
- understanding your data. garbage in, garbage out.
- formatting your data. munging. (numpy, scikit-learn)
- writing your own solution is a good indicator that you don't know what you're doing.
- pre-processing your data. normalize.
- know your goal. e.g. find the gap between the real probability and the odds-implied probability.
- choosing an algorithm.
- read the algorithm docs.
- testing results because you're probably wrong. (cross_validation, now sklearn.model_selection in current scikit-learn; shuffle data)
- interpreting results. don't bet the house.
- share your knowledge. you know more than you think.
- keep learning. you know less than you think.
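
A rough sketch of the normalize / shuffle / cross-validate steps above, using scikit-learn. The synthetic dataset, the choice of LogisticRegression, and the 5-fold split are assumptions made for illustration, not anything from the talk.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in data; swap in your own X (features) and y (labels).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Normalize inside the pipeline so the scaler is fit on each training fold only,
# which keeps information from the held-out fold out of the pre-processing step.
model = make_pipeline(StandardScaler(), LogisticRegression())

# Shuffle before splitting so any ordering in the data does not bias the folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# One accuracy score per fold; the spread hints at how far to trust the mean.
scores = cross_val_score(model, X, y, cv=cv)
print("mean accuracy:", scores.mean(), "std:", scores.std())
```

A large spread across folds is a cue to go back to the data before tuning the algorithm.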