Scaling lessons learned at Dropbox
http://eranki.tumblr.com/post/27076431887/scaling-lessons-learned-at-dropbox-part-1
For most of that time we had one to three people working on the backend. Here are some suggestions on scaling, particularly in a resource-constrained, fast-growing environment that can’t always afford to do things “the right way” (i.e., any real-world engineering project ;-). (一些简单但是非常实用的建议,适合资源受限但是快速发展的团队)
- Run with extra load. # 制造额外压力. 以便出现真正压力时可以关闭这部分压力来临时应对。
- App-specific metrics. # 收集应用相关指标来方便定位问题(添加指标只需要一行代码)
- Poor man’s analytics with bash. # 用bash来做简单的日志分析
- Log spam is really helpful. # 分析垃圾消息
- If something can fail, make sure it does. # 确保设计的failover机制可以正常工作。
- Run infrequent stuff more often in general. # 可以认为是上一个tip的扩展。
- Try to keep things homogeneous. # 尽可能保持所有东西都是同构的,这样设计实现和部署都会简单许多。
- Keeping a downtime log. # 记录down时间和原因等,可以促使自己思考来如何改进。
- UTC. # 内部时间全部使用UTC,只有在最后对外显示时转换成为当地时间。
- Technologies we used. My only suggestion for choosing technology would be to pick lightweight things that are known to work and see a lot of use outside your company, or else be prepared to become the “primary contributor” to the project. # 选择那些被许多公司采用的轻量软件,否则很可能要花大量时间维护这些软件
- Simulate/analyze things before trying them. # 先进行小范围实现分析
- The security-convenience tradeoff.