- COMPUTING MACHINERY AND INTELLIGENCE
- A. M. Turing
- https://www.csee.umbc.edu/courses/471/papers/turing.pdf
- The Imitation Game
   ○ Three players: A, B, and C
      § C wants to determine the gender of A and B, communicating only by teleprinter
      § A may lie to mislead C; B wants to help C win
   ○ The reader must accept it as a fact that digital computers can be constructed, and indeed have been constructed, according to the principles we have described, and that they can in fact mimic the actions of a human computer very closely.
   ○ Even when we consider the actual physical machines instead of the idealised machines, reasonably accurate knowledge of the state at one moment yields reasonably accurate knowledge any number of steps later.
   ○ This special property of digital computers, that they can mimic any discrete-state machine, is described by saying that they are universal machines. The existence of machines with this property has the important consequence that, considerations of speed apart, it is unnecessary to design various new machines to do various computing processes. They can all be done with one digital computer, suitably programmed for each case. It will be seen that as a consequence of this all digital computers are in a sense equivalent.
   ○ Likewise according to this view the only way to know that a man thinks is to be that particular man.
   ○ Processes that are learnt do not produce a hundred per cent certainty of result; if they did they could not be unlearnt.
   ○ We can only see a short distance ahead, but we can see plenty there that needs to be done.
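To make the universal-machine quote above a bit more concrete, here is a minimal sketch of one general-purpose program mimicking a discrete-state machine from nothing but its transition table. The three-position wheel, the signal names `step` and `hold`, and the `run_machine` helper are all hypothetical, picked only for illustration; they are not taken from these notes.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# A discrete-state machine is fully described by a finite table:
# (current state, input signal) -> next state.
Transition = Dict[Tuple[Hashable, Hashable], Hashable]


def run_machine(table: Transition, start: Hashable,
                inputs: Iterable[Hashable]) -> List[Hashable]:
    """Return every state visited, beginning with `start`."""
    state = start
    trace = [state]
    for signal in inputs:
        # Knowing the state now plus the input determines the next state.
        state = table[(state, signal)]
        trace.append(state)
    return trace


# Hypothetical machine: a wheel with three positions. "step" advances the
# wheel by one position; "hold" leaves it where it is.
WHEEL: Transition = {
    ("q1", "step"): "q2", ("q2", "step"): "q3", ("q3", "step"): "q1",
    ("q1", "hold"): "q1", ("q2", "hold"): "q2", ("q3", "hold"): "q3",
}

trace = run_machine(WHEEL, "q1", ["step", "step", "hold", "step"])
print(trace)  # ['q1', 'q2', 'q3', 'q3', 'q1']
```

Swapping in a different table changes the machine being mimicked without touching `run_machine`, which is the sense in which one suitably programmed computer can stand in for many special-purpose machines.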
- https://github.com/papers-we-love/papers-we-love/blob/master/data_science/tidy_data.pdf
- Hadley Wickham
- Data Structure
   ○ A dataset is a collection of values, usually either numbers or strings
   ○ Every value belongs to a variable and an observation.
      § A variable contains all values that measure the same underlying attribute (height, duration..)
      § An observation contains all values measured on the same unit (person, day..) across attributes
- In tidy data:
   ○ Each variable forms a column.
   ○ Each observation forms a row.
   ○ Each type of observational unit forms a table.
- Example:

   year  artist        time  track                    date        week  rank
   2000  2 Pac         4:22  Baby Don’t Cry           2000-02-26  1     87
   2000  2 Pac         4:22  Baby Don’t Cry           2000-03-04  2     82
   2000  2 Pac         4:22  Baby Don’t Cry           2000-03-11  3     72
   2000  2 Pac         4:22  Baby Don’t Cry           2000-03-18  4     77
   2000  2 Pac         4:22  Baby Don’t Cry           2000-03-25  5     87
   2000  2 Pac         4:22  Baby Don’t Cry           2000-04-01  6     94
   2000  2 Pac         4:22  Baby Don’t Cry           2000-04-08  7     99
   2000  2Ge+her       3:15  The Hardest Part Of ...  2000-09-02  1     91
   2000  2Ge+her       3:15  The Hardest Part Of ...  2000-09-09  2     87
   2000  2Ge+her       3:15  The Hardest Part Of ...  2000-09-16  3     92
   2000  3 Doors Down  3:53  Kryptonite               2000-04-08  1     81
   2000  3 Doors Down  3:53  Kryptonite               2000-04-15  2     70
   2000  3 Doors Down  3:53  Kryptonite               2000-04-22  3     68
   2000  3 Doors Down  3:53  Kryptonite               2000-04-29  4     67
   2000  3 Doors Down  3:53  Kryptonite               2000-05-06  5     66
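The example above is already tidy: one row per track per week. As a rough sketch of how such a table could be produced, the code below starts from an assumed "wide" layout with one rank column per week and turns it into tidy rows. The wide column names (`date_entered`, `wk1`, `wk2`, ...) and the `melt_weeks` helper are my assumptions for illustration; the values are the 2 Pac and 2Ge+her rows from the example, and the weekly dates are derived by adding 7 days per week, which matches the dates shown.

```python
from datetime import date, timedelta

# Assumed wide layout: one row per track, one column per chart week.
# None marks a week in which the track had no rank.
wide_rows = [
    {"year": 2000, "artist": "2 Pac", "time": "4:22",
     "track": "Baby Don't Cry", "date_entered": date(2000, 2, 26),
     "wk1": 87, "wk2": 82, "wk3": 72, "wk4": 77,
     "wk5": 87, "wk6": 94, "wk7": 99},
    {"year": 2000, "artist": "2Ge+her", "time": "3:15",
     "track": "The Hardest Part Of ...", "date_entered": date(2000, 9, 2),
     "wk1": 91, "wk2": 87, "wk3": 92, "wk4": None,
     "wk5": None, "wk6": None, "wk7": None},
]


def melt_weeks(rows):
    """Turn each wkN column into its own observation (row)."""
    tidy = []
    for row in rows:
        for key, rank in row.items():
            # Only the wkN columns hold measurements; skip missing ranks.
            if not key.startswith("wk") or rank is None:
                continue
            week = int(key[2:])
            tidy.append({
                "year": row["year"],
                "artist": row["artist"],
                "time": row["time"],
                "track": row["track"],
                # Week 1 is the entry date; each later week is 7 days on.
                "date": row["date_entered"] + timedelta(weeks=week - 1),
                "week": week,
                "rank": rank,
            })
    return tidy


tidy = melt_weeks(wide_rows)
for r in tidy:
    print(r["artist"], r["date"].isoformat(), r["week"], r["rank"])
```

After the conversion, `week` and `rank` are ordinary columns, so every variable is a column and every observation is a row.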
This paper was selected from http://pages.cs.wisc.edu/~remzi/Classes/739/Fall2016/.
On Designing and Deploying Internet-Scale Services - James Hamilton – Windows Live Services Platform
1. Three tenets
   a. Expect failures.
      i. Failures may cause dependent components to fail.
   b. Keep things simple.
      i. Simple things are easier to get right.
      ii. Avoid unnecessary dependencies.
      iii. Simple installation.
      iv. Failure isolation. A failure on one server should have no impact on the rest of the data center.
   c. Automate everything.
      i. People make mistakes.
2. Deploy an operations-friendly service
   a. Overall Application Design
      i. When a system fails, the natural tendency is to look first to operations.
      ii. Simplicity is the key to efficient operations.
         1) Design for failure. The entire service must be capable of surviving failure without human administrative interaction. To test the failure path, just hard-fail it.
         2) Redundancy and fault recovery
            a) Is the operations team willing and able to bring down any server in the service at any time without draining the workload first?
            b) Security threat modeling
               i) Consider each possible security threat and implement adequate mitigation for each. The same approach applies to fault recovery.
            c) Document all conceivable component failure modes and combinations.
               i) Make sure that the service can continue to operate without unacceptable loss in service quality.
               ii) At scale, rare combinations of errors can become commonplace.
         3) Commodity hardware slice
            a) Large clusters of commodity servers cost far less ($) than a small number of large servers.
            b) I/O is the constraint. Server performance continues to increase much faster than I/O performance, so a small server is a more balanced system for a given amount of disk.
            c) Power consumption scales linearly with servers but cubically with clock frequency --> higher-performance servers cost more ($$$).
            d) A small server failing affects a smaller share of the overall service workload.
         4) Single-version software
            a) Target a single internal deployment.
            b) Previous versions don't have to be supported for a decade.
               i) The most economical services don't give customers control over the version they run, and host only one version. This requires:
                  One. Few user-experience (UE) changes between releases.
                  Two. Willingness to allow customers that need this level of control to either host internally or switch to an application service provider.
         5) Multi-tenancy
            a) Hosting all companies or end users of a service in the same service without physical isolation.
            b) Single tenancy: segregation of groups of users in an isolated cluster.
         6) Quick service health check
            a) The service's version of a build verification test.
            b) Ensure that the service isn't broken in any substantive way.
         7) Develop in the full environment
            a) Developers should unit test their components and also test the full service with their component changes.
         8) Zero trust of underlying components
            a) Assume that underlying components will fail.
         9) Understand access patterns
            a) "What impacts will this feature have on the rest of the infrastructure"
            b) Measure and validate the feature for load when live.
         10) Version everything
            a) Expect a mixed-version environment.
            b) Aim to run single-version software, but expect multiple versions to be live across production and test.
         11) Keep the unit/functional tests from the last release
            a) Keep the n-1 version tests.
         12) Avoid single points of failure
            a) Prefer stateless implementations. Don't affinitize requests. Static work allocation (for example, static hashing) is bad.
            b) Use fine-grained partitioning (where related individual tuples, e.g., cliques of friends, are co-located in the same partition) and don't support cross-partition operations, to allow efficient scaling across many database servers.
   b. Automatic Management and Provisioning
      i. It can be hard because human judgement is sometimes needed (dependency).
      ii. Be restartable and redundant
         1) Persistent state is stored redundantly.
      iii. Support geo-distribution
         1) Support running across several hosting data centers.
      iv. Automatic provisioning and installation
      v. Configuration and code as a unit
         1) Deliver code and configuration as a single unit.
         2) Operations deploys them as a unit.
         3) Services should treat config and code as a unit.
         4) An audit log is required if a config change must be made in production.
      vi. Manage server roles or personalities rather than servers
      vii. Multi-system failures are common
      viii. Recover at the service level
         1) Handle failures and correct errors at the service level, with full context, rather than in lower software levels.
      ix. Never rely on local storage for non-recoverable information
         1) Duplicate all the non-ephemeral service state.
      x. Keep deployment simple
         1) File copy, minimal external dependencies.
      xi. Fail services regularly
         1) Unwilling to do this? Then there is not yet confidence that the service will keep operating through failures.
   c. Dependency Management
      i. Expect latency. Calls to external components may take a long time to complete.
         1) Set timeouts.
         2) Operational idempotency allows requests to be restarted after a timeout even though they may have partially or even fully completed.
         3) Ensure all restarts are reported, and bound the number of restarts so that a repeatedly failing request cannot keep consuming resources.
      ii. Isolate failures
         1) Avoid cascading failures.
      iii. Use shipping and proven components
         1) Stable versions of software and hardware.
      iv. Implement inter-service monitoring and alerting
         1) Need to know when a dependent service is being overloaded.
      v. Dependent services require the same design point
      vi. Decouple components
         1) Ensure that components can continue operation, perhaps in a degraded mode, during failures of other components. For example, maintain a session key and refresh it every N hours.
   d. Release Cycle and Testing
      i. Invest in engineering
         1) Services that don't think big to start with will be scrambling to catch up later.
      ii. Support version roll-back
      iii. Maintain forward and backward compatibility
         1) Changes between components are all potential risks. Don't rip out support for old file formats until there is no chance of a roll-back to that old format in the future.
      iv. Single-server deployment
         1) The entire service must be easy to host on a single system --> for unit testing.
      v. Stress test for load
      vi. Perform capacity and performance testing prior to new releases
         1) Do this at the service level.
      vii. Build and deploy shallowly and iteratively
         1) Get a skeleton version of the full service running at an early stage.
      viii. Test with real data
      ix. Run system-level acceptance tests
         1) Sanity check.
      x. Test and develop in full environments
         1) Use the same data collection and mining techniques used in production.
3. Graceful Degradation and Admission Control
   a. A big red switch.
      i. A designed and tested action that can be taken when the service is no longer able to meet its SLA (service-level agreement).
      ii. Keep the vital processing progressing while shedding or delaying some non-critical workload.
      iii. Determine what is minimally required if the system is in trouble, then implement and test the option to shut off the non-essential services when that happens.
   b. Control admission
      i. If the current load cannot be processed on the system, admitting more workload only gives more users a bad experience. Example: an email service stops queuing and does not accept more mail into the system.
      ii. Service premium customers over non-premium customers.
   c. Meter admission.
      i. A modification of the admission control point.
      ii. Be able to bring the system back up slowly. Ramp up: 1 user, 10 users, 100 users.
      iii. Ways to notify users.
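The "Control admission" and "Meter admission" notes can be sketched in a few lines. This is only my illustration, not code from the paper: the `AdmissionController` class, its method names, and the 20% headroom reserved for premium customers are all assumptions. The ramp steps of 1, 10, and 100 come from the notes.

```python
class AdmissionController:
    """Reject work up front once the current admission limit is reached."""

    def __init__(self, ramp_steps=(1, 10, 100), premium_reserve=0.2):
        self.ramp_steps = list(ramp_steps)      # metered limits: 1, 10, 100 users
        self.step = 0                           # start at the smallest limit
        self.premium_reserve = premium_reserve  # assumed share kept for premium users
        self.active = 0                         # users currently admitted

    @property
    def limit(self):
        return self.ramp_steps[self.step]

    def try_admit(self, premium=False):
        # Non-premium users may only fill the unreserved part of the limit.
        reserved = int(self.limit * self.premium_reserve)
        ceiling = self.limit if premium else self.limit - reserved
        if self.active >= ceiling:
            return False  # refuse instead of queuing work that cannot be processed
        self.active += 1
        return True

    def release(self):
        # Call when an admitted user finishes.
        self.active = max(0, self.active - 1)

    def ramp_up(self):
        # Move to the next limit once the system is healthy at the current one.
        if self.step < len(self.ramp_steps) - 1:
            self.step += 1


ctl = AdmissionController()
first = ctl.try_admit()    # admitted: the limit is 1
second = ctl.try_admit()   # rejected: the limit of 1 is already used
ctl.ramp_up()              # the limit is now 10, with 2 slots reserved for premium
regular = [ctl.try_admit() for _ in range(8)]
premium = [ctl.try_admit(premium=True) for _ in range(3)]
print(first, second, regular, premium)
```

Rejecting at the door keeps the load that was already admitted workable, and the reserved headroom is one simple way to favor premium customers when capacity is tight.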
The conclusion: Spending time engineering the system at the beginning is worth it.
When I woke up at 11 am this morning, I read an article saying that most successful businessmen wake up at 4 am every morning. OMG. You must be kidding. Coders don't get to sleep until 2 am.
After that, I started reflecting on what I should do with my last 6 months of college, when I have barely anything to do. Lying in bed, thinking about nothing, and watching YouTube videos is a good way to waste time. I tried it for the past week, but I cannot stand it anymore because wasting time this way is so meaningless. You know, those videos are not really that hilarious if I think them through carefully afterwards. Awkward.
Should I do something more interesting than repeatedly opening YouTube? Maybe I should.
How about reading a computer science paper each day?
Seriously? I won’t stick with it; I know myself. I re-opened YouTube and watched another recommended video. HAHAHAHAHAHA. Wait. It’s not funny. I feel so ashamed of myself, once again.
Let me just go back here and read the papers! To motivate myself, I will post a paper review/summary once a day. It’s going to be awesome and I know it.
Here we go.