AIT

Questions about the lecture 'Advanced Internet Technology' of the RWTH Aachen


Flashcard Set Details

Cards 236
Language English
Category Computer Science
Level University
Created / Updated 05.02.2017 / 10.10.2017
Weblink
https://card2brain.ch/box/20170205_ait_chapter_overview

What are the characteristics of a delete in Cassandra?

Data is not removed right away: a tombstone is added to the log and the data is deleted later
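
The tombstone idea can be illustrated with a minimal sketch. This is not Cassandra's implementation; `TinyStore`, its append-only `log`, and the `compact` step are hypothetical names chosen for the illustration.

```python
TOMBSTONE = object()  # marker value meaning "this key was deleted"


class TinyStore:
    """Hypothetical append-only key-value log with tombstone deletes."""

    def __init__(self):
        self.log = []  # (key, value) entries, oldest first

    def write(self, key, value):
        self.log.append((key, value))

    def delete(self, key):
        # A delete is just another write: append a tombstone, remove nothing yet.
        self.write(key, TOMBSTONE)

    def read(self, key):
        # The newest entry wins; a tombstone hides all older values.
        for k, v in reversed(self.log):
            if k == key:
                return None if v is TOMBSTONE else v
        return None

    def compact(self):
        # Later cleanup: keep the newest entry per key, then drop tombstones.
        latest = {}
        for key, value in self.log:
            latest[key] = (key, value)
        self.log = [e for e in latest.values() if e[1] is not TOMBSTONE]
```

After `delete("a")` the log grows by one entry and `read("a")` returns `None`; the old value only disappears once `compact()` runs.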

What are the characteristics of a read in Cassandra? [4]

1. Similar to a write

2. Contact closest replica

3. Check consistency and initiate read-repair

4. Slower than a write
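
The consistency check with read-repair can be sketched as follows. This is a simplified illustration, not Cassandra's code: it assumes every replica is a plain dict mapping a key to a `(timestamp, value)` pair, that all replicas are reachable, and that the newest timestamp wins. The function name `read_with_repair` is hypothetical.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and repair stale replicas.

    replicas: list of dicts mapping key -> (timestamp, value).
    """
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    # Consistency check: the version with the highest timestamp is current.
    newest = max(versions, key=lambda tv: tv[0])
    # Read-repair: overwrite every replica that is missing or stale.
    for r in replicas:
        if r.get(key) != newest:
            r[key] = newest
    return newest[1]
```

Touching several replicas and comparing versions is extra work that a write does not need, which is one way to see why reads tend to be slower.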

What does the workload look like? [3]

1. Large and unstructured data

2. Random reads and writes

3. Five needs

Which needs exist in the workload? [4]

1. Scalability // Scale out and not scale up

2. Speed

3. Fault-tolerance

4. Low total cost of ownership (TCO)

What is the definition of scale up in the needs? [4]

1. Make bigger

2. Classical enterprise settings

3. Flexible ACID transactions

4. Transactions in a single node

What is the definition of scale out in the needs? [4]

1. Make more

2. Cloud friendly (key-value stores)

3. Queries at a single server // Limited functionality

4. No multi-row transactions

What is the CAP theorem by Brewer?

In a distributed system, one can satisfy at most 2 of the 3 properties

Which are the components of the CAP theorem? [3]

1. Consistency means all nodes see the same data at any time

2. Availability means the system allows operations at any time

3. Partition-tolerance means continuous work in spite of partitions

Which approaches exist for adding new nodes in cloud computing? [2]

1. Randomly by system

2. Manually by administrator

How to operate on distributed data?

Parallelize and process directly at location

How much faster are parallel programs by Amdahl’s law?

With parallelizable portion p and n processors, the speedup is S = 1/((1-p) + p/n)
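
A small sketch makes the formula concrete. The function name `amdahl_speedup` is chosen for illustration; `p` is the parallelizable portion and `n` the number of processors.

```python
def amdahl_speedup(p, n):
    """Upper bound on speedup for parallelizable portion p on n processors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    # The serial part (1 - p) is unaffected; the parallel part is divided by n.
    return 1.0 / ((1.0 - p) + p / n)
```

For p = 0.9 and n = 10 this gives 1/0.19, roughly 5.26. Even with arbitrarily many processors the speedup for p = 0.9 stays below 1/(1-p) = 10, because the serial 10% remains.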

What are the restrictions of Amdahl’s law speedup?

S is only an upper bound, because of overhead and workload imbalance

What are possible improvements of Amdahl’s law speedup? [3]

1. Maximize portion p

2. Balance workload

3. Minimize overhead

What are the characteristics of request-level parallelism (RLP)? [2]

1. Independent requests // Connections across requests are rare

2. Little read-write (so-called producer-consumer) sharing
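
Because the requests are independent, they can be handed to parallel workers without coordination. The sketch below assumes a hypothetical `handle_request` that stands in for real request processing; it uses a thread pool from the Python standard library.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(query):
    # Hypothetical stand-in for real work; it touches no shared state.
    return query.upper()


def serve(requests, workers=4):
    """Process independent requests in parallel; results keep request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, requests))
```

No locks are needed here because no request reads what another request writes.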

What is a possible request-level parallelism (RLP) application?

Google’s Query-Serving Architecture

What are the characteristics of Google’s Query-Serving Architecture? [2]

1. Random distribution of data

2. Many redundant replicas used for load balance

What does the algorithm of Google’s Query-Serving Architecture look like? [7]

1. Request to nearest Warehouse Scale Computer (WSC)

2. Load balancer redirects to cluster within WSC

3. Forward to a Google Web Server (GWS) that handles the request

4. Ask index server for relevant documents

5. Return relevant document list

6.1 Get indexed documents with IDs

6.2 Ad system and images

7.1 Order resulting documents

7.2 Decorate with sponsored links

What are the characteristics of data-level parallelism (DLP)? [2]

1. Process large, distributed data with simple computations

2. Tolerate faults

What is a possible data-level parallelism (DLP) application?

MapReduce (MR)

What are the characteristics of MapReduce (MR)? [4]

1. By Google for ~20 PB/d

2. Two phases map and reduce

3. Automatic parallelization across large-scale clusters

4. Handles failures, efficient communication, and performance issues

What are possible uses of MapReduce (MR)? [6]

1. Web crawl // Outgoing links from HTML documents

2. Google Search

3. Google Earth

4. Google Maps

5. 10k MR programs at Google in 4 years

6. 100k MR jobs per day in 2008

What are the components of MapReduce (MR)? [6]

1. Data are key-value (K,V) pairs

2. Input split means to split data for single map tasks

3. Map function for each split

4. Partitioner for a set of mappers

5. Reduce function for one out_key in a partition

6. Tasks

What are the characteristics of data of MapReduce (MR)?

Data is shared across the cluster but has a single namespace

What are the characteristics of map function of MapReduce (MR)? [2]

1. Input in_key and in_val

2. Output out_key and intermed_value

What are the characteristics of partitioner of MapReduce (MR)?

Groups equal out_keys together for one reducer by sorting

What are the characteristics of reduce function of MapReduce (MR)? [3]

1. One key can have multiple appearances

2. Input out_key and list of intermed_value

3. Output out_value
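
The map, partitioner, and reduce roles can be put together in a single-process word-count sketch. This is an illustration, not Google's or Hadoop's API: `map_fn`, `partition`, `reduce_fn`, and `run_mapreduce` are hypothetical names, and the partitioner uses a toy hash.

```python
from collections import defaultdict


def map_fn(in_key, in_val):
    """Map: (document name, text) -> list of (out_key, intermed_value)."""
    return [(word, 1) for word in in_val.split()]


def partition(out_key, num_reducers):
    """Partitioner: pick the reducer responsible for out_key (toy hash)."""
    return sum(out_key.encode()) % num_reducers


def reduce_fn(out_key, intermed_values):
    """Reduce: (out_key, list of intermed_value) -> out_value."""
    return sum(intermed_values)


def run_mapreduce(splits, num_reducers=2):
    # One bucket per reducer, each mapping out_key -> list of intermed_value.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for in_key, in_val in splits:  # one map call per input split
        for out_key, value in map_fn(in_key, in_val):
            partitions[partition(out_key, num_reducers)][out_key].append(value)
    result = {}
    for bucket in partitions:  # one reduce call per out_key in a partition
        for out_key in sorted(bucket):
            result[out_key] = reduce_fn(out_key, bucket[out_key])
    return result
```

Calling `run_mapreduce([("d1", "a b a"), ("d2", "b c")])` counts `a` twice, `b` twice, and `c` once. In a real cluster the map and reduce calls would run as separate tasks on different nodes.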

What are the characteristics of tasks of MapReduce (MR)? [5]

1. In isolation

2. Limited communication for scalability and fault tolerance

3. Restart tasks in case of node failure

4. Run multiple task copies // Requires scheduling

5. One data chunk per map is common

What are the characteristics of scheduling of tasks of MapReduce (MR)? [7]

1. Master Slave architecture

2. Slave Task Tracker (TT)

3. Master Job Tracker (JT)

4. TT periodically polls the JT with a task request // Heartbeat

5. JT assigns Map locally and Reduce anytime

6. JT re-assigns jobs if heartbeat stops

7. JT re-assigns slow tasks (so-called stragglers) // Speculative execution
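
The heartbeat and re-assignment steps can be sketched with a toy tracker. This is not Hadoop's code: the `JobTracker` class below, its `timeout`, and the explicit `now` argument are assumptions made to keep the example deterministic. Speculative execution of stragglers is left out.

```python
class JobTracker:
    """Toy master that hands out tasks and re-queues tasks of silent workers."""

    def __init__(self, tasks, timeout=3.0):
        self.pending = list(tasks)  # tasks waiting for a worker
        self.running = {}           # task -> name of the tracker running it
        self.last_seen = {}         # tracker name -> time of last heartbeat
        self.timeout = timeout

    def heartbeat(self, tracker, now):
        """Called by a Task Tracker; doubles as a request for work."""
        self.last_seen[tracker] = now
        if self.pending:
            task = self.pending.pop(0)
            self.running[task] = tracker
            return task
        return None

    def reap_dead_trackers(self, now):
        """Re-queue tasks whose tracker has not sent a heartbeat in time."""
        dead = {t for t, seen in self.last_seen.items()
                if now - seen > self.timeout}
        for task, tracker in list(self.running.items()):
            if tracker in dead:
                del self.running[task]
                self.pending.append(task)
        for tracker in dead:
            del self.last_seen[tracker]
        return dead
```

A tracker that stops calling `heartbeat` is declared dead after `timeout`, and its task goes to the next tracker that asks for work.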

What is a trend inspired by cloud computing?

Fog computing

What are the basic characteristics of fog computing? [3]

1. Computing on network edges with end-user clients

2. Good concept for IoT

3. Combination of cloud computing, P2P, and control plane

Which problems exist inside the Internet regarding SDN? [4]

1. Software connected to hardware // Buggy software

2. Operating the network is more than half the total cost

3. Slow protocol standardization and long delays for new features

4. Evolution of becoming faster instead of better // Compared to other areas

What are the two components of traditional networking? [2]

1. Data plane (DP) // Sending data

2. Control plane (CP) // Where to send data

What are the goals of traditional networking? [2]

1. Efficient routing // Distributed algorithms, adjusting weights, MPLS

2. Isolation // ACLs, VLANs, Firewalls

What are the consequences of traditional networking? [3]

1. No modularity

2. (Too) many mechanisms without abstraction

3. Limited functionality

What has to be done on the data plane with the packets? [6]

1. Forward

2. Filter

3. Buffer

4. Mark

5. Rate-limit

6. Measure

Which components decide what to do on the data plane with the packets? [2]

1. Forwarding state

2. Packet header
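
How forwarding state and packet header combine into a decision can be sketched with a longest-prefix-match lookup. The table layout (prefix string to port name) and the function name `forward` are assumptions for illustration; real routers use specialized data structures and hardware.

```python
import ipaddress


def forward(forwarding_state, dst_ip):
    """Return the output port for dst_ip by longest-prefix match, else None."""
    addr = ipaddress.ip_address(dst_ip)  # taken from the packet header
    best = None  # (prefix length, port) of the most specific match so far
    for prefix, port in forwarding_state.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, port)
    return None if best is None else best[1]
```

With the entries `10.0.0.0/8`, `10.1.0.0/16`, and `0.0.0.0/0`, a packet to `10.1.2.3` matches all three and leaves through the port of the most specific one, `10.1.0.0/16`.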

What influences the control plane? [4]

1. Track topology changes

2. Compute routes

3. Install forwarding rules

4. Generate forwarding states
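
Computing routes and generating forwarding state can be illustrated with a shortest-path computation at a single router. The sketch assumes the topology is known as a dict of link costs and uses Dijkstra's algorithm, as link-state protocols such as OSPF do; the function name and data layout are hypothetical.

```python
import heapq


def compute_forwarding_state(topology, source):
    """topology: {node: {neighbor: link_cost}}; returns {destination: next_hop}."""
    dist = {source: 0}
    next_hop = {}
    heap = [(0, source, None)]  # (cost so far, node, first hop on the path)
    visited = set()
    while heap:
        cost, node, first = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if first is not None:
            next_hop[node] = first
        for neigh, weight in topology.get(node, {}).items():
            new_cost = cost + weight
            if neigh not in visited and new_cost < dist.get(neigh, float("inf")):
                dist[neigh] = new_cost
                # Neighbors of the source are their own first hop.
                hop = neigh if first is None else first
                heapq.heappush(heap, (new_cost, neigh, hop))
    return next_hop
```

When the topology changes, for example because a link fails, the function is simply run again on the updated dict and the resulting next hops are installed as new forwarding rules.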

What are possible management systems of the control plane? [2]

1. CLI

2. SNMP

What are possible routing protocols of the control plane? [3]

1. OSPF

2. IS-IS

3. BGP

What are the characteristics of software engineering for networks? [2]

1. Abstractions for the network domain // Programming languages and OS

2. No additional functionality, only better organization