AIT
Questions about the lecture 'Advanced Internet Technology' of the RWTH Aachen
Deck details
Cards | 236 |
---|---|
Language | English |
Category | Computer Science |
Level | University |
Created / Updated | 05.02.2017 / 10.10.2017 |
Web link | https://card2brain.ch/box/20170205_ait_chapter_overview |
What are the characteristics of a delete in Cassandra?
Data is not removed right away: a tombstone is added to the log and the data is deleted later
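The tombstone idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's actual implementation: the class `TinyStore`, the `TOMBSTONE` marker, and the `compact` method are hypothetical names chosen for illustration.

```python
# Toy illustration of tombstone-based deletes; all names are hypothetical.
TOMBSTONE = object()  # marker meaning "this key was deleted"


class TinyStore:
    def __init__(self):
        self.log = []  # append-only list of (key, value) entries

    def write(self, key, value):
        self.log.append((key, value))

    def delete(self, key):
        # Nothing is removed here; a tombstone is appended instead.
        self.log.append((key, TOMBSTONE))

    def read(self, key):
        # The latest entry for a key wins; a tombstone hides older values.
        for k, v in reversed(self.log):
            if k == key:
                return None if v is TOMBSTONE else v
        return None

    def compact(self):
        # Later clean-up: keep only the newest live value per key.
        latest = {}
        for k, v in self.log:
            latest[k] = v
        self.log = [(k, v) for k, v in latest.items() if v is not TOMBSTONE]


store = TinyStore()
store.write("a", 1)
store.delete("a")
entries_before = len(store.log)  # old value and tombstone are both still stored
store.compact()
entries_after = len(store.log)   # compaction physically drops them
```

The delete itself is just another append, which keeps it as cheap as a write; the actual removal is deferred to the clean-up step.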
What are the characteristics of a read in Cassandra? [4]
1. Similar to a write
2. Contacts the closest replica
3. Check consistency and initiate read-repair
4. Slower than write
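The consistency check and read repair from point 3 can be illustrated with a small Python sketch. It assumes replicas are plain dictionaries mapping a key to a `(timestamp, value)` pair and repairs stale copies synchronously; the function name `read_with_repair` is hypothetical, and a real system would typically contact only some replicas and repair in the background.

```python
# Toy read path with read repair; replicas are plain dicts (an assumption).
def read_with_repair(replicas, key):
    """Return the newest value for key and repair stale replicas.

    Each replica maps key -> (timestamp, value).
    """
    answers = [r.get(key) for r in replicas]
    known = [a for a in answers if a is not None]
    if not known:
        return None
    newest = max(known, key=lambda tv: tv[0])  # highest timestamp wins
    for replica, answer in zip(replicas, answers):
        if answer != newest:
            replica[key] = newest  # read repair: overwrite the stale copy
    return newest[1]


replicas = [
    {"x": (2, "new")},
    {"x": (1, "old")},
    {},
]
value = read_with_repair(replicas, "x")
```

After the call, all three replicas hold the newest version of `x`. The extra comparison work is one reason a read can be slower than a write.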
What does the workload look like? [3]
1. Large and unstructured data
2. Random reads and writes
3. Five needs
Which needs exist in the workload? [4]
1. Scalability // Scale out, not scale up
2. Speed
3. Fault-tolerance
4. Low total cost of ownership (TCO)
What is the definition of scale up in the needs? [4]
1. Make bigger
2. Classical enterprise settings
3. Flexible ACID transactions
4. Transactions in a single node
What is the definition of scale out in the needs? [4]
1. Make more
2. Cloud friendly (key-value stores)
3. Queries at a single server // Limited functionality
4. No multi-row transactions
What is the CAP theorem by Brewer?
In a distributed system, at most two of the three properties can be satisfied at the same time
Which are the components of the CAP theorem? [3]
1. Consistency means all nodes see the same data at any time
2. Availability means the system allows operations at any time
3. Partition-tolerance means the system continues to work in spite of network partitions
Which approaches exist for adding new nodes in cloud computing? [2]
1. Randomly by system
2. Manually by administrator
How to operate on distributed data?
Parallelize and process directly at location
How much faster are parallel programs by Amdahl’s law?
With parallelizable portion p and n processors, the speedup is S = 1/((1-p) + p/n)
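The formula is easy to evaluate directly. The following Python sketch assumes that n is the number of processors and p the parallelizable fraction of the runtime; the function name `amdahl_speedup` is chosen for illustration.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Upper bound on speedup for parallelizable fraction p on n processors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


# Speedup for a program that is 90% parallelizable on a growing machine.
for n in (1, 2, 8, 64):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

As n grows, the value approaches 1/(1-p), which is 10 for p = 0.9: the serial portion dominates no matter how many processors are added.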
What are the restrictions of Amdahl’s law speedup?
S is only an upper bound; overhead and workload imbalance reduce the actual speedup
What are possible improvements of Amdahl’s law speedup? [3]
1. Maximize portion p
2. Balance workload
3. Minimize overhead
What are the characteristics of request-level parallelism (RLP)? [2]
1. Independent requests // Connections across requests are rare
2. Little read-write (so-called producer-consumer) sharing
What is a possible request-level parallelism (RLP) application?
Google’s Query-Serving Architecture
What are the characteristics of Google’s Query-Serving Architecture? [2]
1. Random distribution of data
2. Many redundant replicas used for load balance
What does the algorithm of Google’s Query-Serving Architecture look like? [7]
1. Request to next Warehouse Scale Computer (WSC)
2. Load balancer redirects to cluster within WSC
3. Forward to Google Web Servers (GWS) handling request
4. Ask index server for relevant documents
5. Return relevant document list
6.1 Get indexed documents with IDs
6.2 Ad system and images
7.1 Order resulting documents
7.2 Decorate with sponsored links
What are the characteristics of data-level parallelism (DLP)? [2]
1. Compute largely distributed data with simple computations
2. Tolerate faults
What is a possible data-level parallelism (DLP) application?
MapReduce (MR)
What are the characteristics of MapReduce (MR)? [4]
1. By Google for ~20 PB/d
2. Two phases map and reduce
3. Automatic parallelization across large-scale clusters
4. Handle failures, efficient communication and performance
What are possible uses of MapReduce (MR)? [6]
1. Web crawl // Outgoing links from HTML documents
2. Google Search
3. Google Earth
4. Google Maps
5. 10k MR programs at Google in 4 years
6. 100k MR jobs per day in 2008
What are the components of MapReduce (MR)? [6]
1. Data are key-value (K,V) pairs
2. Input split means to split data for single map tasks
3. Map function for each split
4. Partitioner for a set of mappers
5. Reduce function for one out_key in a partition
6. Tasks
What are the characteristics of data of MapReduce (MR)?
Data is shared across the cluster but has a single namespace
What are the characteristics of map function of MapReduce (MR)? [2]
1. Input in_key and in_val
2. Output out_key and intermed_value
What are the characteristics of partitioner of MapReduce (MR)?
Groups identical out_key values together for one reducer by sorting
What are the characteristics of reduce function of MapReduce (MR)? [3]
1. One key can have multiple appearances
2. Input out_key and list of intermed_value
3. Output out_value
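The map, partitioner, and reduce steps described above can be sketched as a single-process word count in Python. This is a minimal illustration, not Google's or Hadoop's implementation: the function names `map_fn`, `partition`, `reduce_fn`, and `run_mapreduce` are hypothetical, and the byte-sum partitioner is only a simple deterministic stand-in for a hash function.

```python
from collections import defaultdict


def map_fn(in_key, in_val):
    # in_key: document name, in_val: document text
    for word in in_val.split():
        yield word, 1  # (out_key, intermed_value)


def partition(out_key, num_reducers):
    # Deterministic toy partitioner: the same key always reaches the same reducer.
    return sum(out_key.encode()) % num_reducers


def reduce_fn(out_key, intermed_values):
    return sum(intermed_values)  # out_value


def run_mapreduce(splits, num_reducers=2):
    # Map phase: one map call per input split.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for in_key, in_val in splits.items():
        for out_key, value in map_fn(in_key, in_val):
            partitions[partition(out_key, num_reducers)][out_key].append(value)
    # Reduce phase: each reducer handles its keys in sorted order.
    result = {}
    for part in partitions:
        for out_key in sorted(part):
            result[out_key] = reduce_fn(out_key, part[out_key])
    return result


splits = {"doc1": "to be or not to be", "doc2": "to do"}
counts = run_mapreduce(splits)
```

Because every occurrence of a key lands in the same partition, each reducer sees the complete list of intermediate values for its keys and can work independently of the other reducers.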
What are the characteristics of tasks of MapReduce (MR)? [5]
1. In isolation
2. Limited communication for scalability and fault tolerance
3. Restart tasks in case of node failure
4. Run multiple task copies // Requires scheduling
5. One data chunk per map is common
What are the characteristics of scheduling of tasks of MapReduce (MR)? [7]
1. Master Slave architecture
2. Slave Task Tracker (TT)
3. Master Job Tracker (JT)
4. TT pulls periodically with task request // heartbeat
5. JT assigns Map locally and Reduce anytime
6. JT re-assigns jobs if heartbeat stops
7. JT re-assigns slow tasks (so-called stragglers) // Speculative execution
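The heartbeat-based pull and re-assignment (points 4 and 6) can be sketched as follows. This is a toy model under stated assumptions: the class name `JobTracker`, the timeout of 10 time units, and the explicit `now` argument are hypothetical, and task completion, data locality, and speculative execution are left out.

```python
# Toy job tracker; class name, timeout value, and clock handling are assumptions.
class JobTracker:
    def __init__(self, tasks, timeout=10.0):
        self.pending = list(tasks)   # tasks not yet assigned
        self.running = {}            # tracker id -> task
        self.last_seen = {}          # tracker id -> time of last heartbeat
        self.timeout = timeout

    def heartbeat(self, tracker_id, now):
        """A task tracker reports in and pulls a task if it is idle."""
        self.last_seen[tracker_id] = now
        if tracker_id not in self.running and self.pending:
            self.running[tracker_id] = self.pending.pop(0)
        return self.running.get(tracker_id)

    def check_failures(self, now):
        """Re-queue tasks of trackers whose heartbeat has stopped."""
        for tracker_id, seen in list(self.last_seen.items()):
            if now - seen > self.timeout:
                task = self.running.pop(tracker_id, None)
                if task is not None:
                    self.pending.append(task)
                del self.last_seen[tracker_id]


jt = JobTracker(["map-0", "map-1"])
first = jt.heartbeat("tt-a", now=0.0)    # tt-a pulls map-0
second = jt.heartbeat("tt-b", now=1.0)   # tt-b pulls map-1
jt.heartbeat("tt-b", now=12.0)           # tt-b is still alive, tt-a is silent
jt.check_failures(now=12.0)              # tt-a timed out, map-0 is re-queued
retry = jt.heartbeat("tt-c", now=13.0)   # tt-c picks up map-0
```

Passing the time in explicitly keeps the sketch deterministic; a real tracker would read a clock instead.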
What is a trend inspired by cloud computing?
Fog computing
What are the basic characteristics of fog computing? [3]
1. Computing on network edges with end-user clients
2. Good concept for IoT
3. Combination of cloud computing, P2P, and control plane
Which problems exist inside the Internet regarding SDN? [4]
1. Software connected to hardware // Buggy software
2. Operating the network is more than half the total cost
3. Slow protocol standardization and long delays for new features
4. Evolution of becoming faster instead of better // Compared to other areas
What are the two components of traditional networking? [2]
1. Data plane (DP) // Sending data
2. Control plane (CP) // Where to send data
What are the goals of traditional networking? [2]
1. Efficient routing // Distributed algorithms, adjusting weights, MPLS
2. Isolation // ACLs, VLANs, Firewalls
What are the consequences of traditional networking? [3]
1. No modularity
2. (Too) many mechanisms without abstraction
3. Limited functionality
What has to be done on the data plane with the packets? [6]
1. Forward
2. Filter
3. Buffer
4. Mark
5. Rate-limit
6. Measure
Which components decide what to do on the data plane with the packets? [2]
1. Forwarding state
2. Packet header
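How forwarding state and packet header combine into a decision can be illustrated with a longest-prefix-match lookup in Python. The table contents, port names, and the dictionary-shaped packet header are made up for this sketch; real devices implement the lookup in hardware with many more header fields.

```python
import ipaddress

# Forwarding state: prefix -> action. Prefixes and port names are made up.
FORWARDING_STATE = {
    ipaddress.ip_network("10.0.0.0/8"): "port1",
    ipaddress.ip_network("10.1.0.0/16"): "port2",
    ipaddress.ip_network("0.0.0.0/0"): "drop",
}


def forward(packet_header):
    """Pick the action of the most specific prefix matching the destination."""
    dst = ipaddress.ip_address(packet_header["dst"])
    matches = [net for net in FORWARDING_STATE if dst in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_STATE[best]


action_a = forward({"dst": "10.1.2.3"})   # matches /8 and /16, the /16 wins
action_b = forward({"dst": "10.9.9.9"})   # only the /8 and the default match
action_c = forward({"dst": "192.0.2.1"})  # only the default entry matches
```

The data plane only applies this state; deciding what the table should contain is the job of the control plane.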
What influences the control plane? [4]
1. Track topology changes
2. Compute routes
3. Install forwarding rules
4. Generate forwarding states
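The steps above (track a topology change, compute routes, derive forwarding state) can be sketched with Dijkstra's algorithm in Python. The topology, node names, and the function `compute_next_hops` are hypothetical; real control planes run distributed protocols rather than a single function call.

```python
import heapq


def compute_next_hops(topology, source):
    """Dijkstra over topology {node: {neighbor: cost}}; returns {dest: next_hop}."""
    dist = {source: 0}
    next_hop = {}
    queue = [(0, source, None)]  # (cost, node, first hop on the path)
    while queue:
        cost, node, first = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry
        if first is not None:
            next_hop.setdefault(node, first)
        for neighbor, weight in topology[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                hop = neighbor if node == source else first
                heapq.heappush(queue, (new_cost, neighbor, hop))
    return next_hop


topology = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
rules = compute_next_hops(topology, "A")  # forwarding state for node A

# Topology change: the link A-B fails, so routes are recomputed.
del topology["A"]["B"]
del topology["B"]["A"]
rules_after_failure = compute_next_hops(topology, "A")
```

Before the failure, node A reaches C via B because the two-hop path is cheaper; afterwards all traffic from A goes out over the direct link to C.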
What are possible management systems of the control plane? [2]
1. CLI
2. SNMP
What are possible routing protocols of the control plane? [3]
1. OSPF
2. IS-IS
3. BGP
What are the characteristics of software engineering for networks? [2]
1. Abstractions for the network domain // Programming languages and OS
2. No additional functionality, only better organization