AIT

Questions about the lecture 'Advanced Internet Technology' of the RWTH Aachen

Questions about the lecture 'Advanced Internet Technology' of the RWTH Aachen


Fichier Détails

Cartes-fiches 236
Langue English
Catégorie Informatique
Niveau Université
Crée / Actualisé 05.02.2017 / 10.10.2017
Lien de web
https://card2brain.ch/box/20170205_ait_chapter_overview
Intégrer
<iframe src="https://card2brain.ch/box/20170205_ait_chapter_overview/embed" width="780" height="150" scrolling="no" frameborder="0"></iframe>

What holds for real P2P networks? [2]

1. Depends on protocol

2. Gnutella implies a SF distribution following power-law rule // Vertices with high degree answer more likely to new PING messages

What is the basic idea or principle of structured P2P approaches?

The distributed hash tables (DHT)

What is the idea of distributed indexing? [2]

1. Each peer has responsibility for set of data via function f

2. Find peer with set of data via function f

What are the characteristics of distributed management of data? [3]

1. Peer with address space is responsible for data with same address space

2. Share data responsibility if nodes appears or disappear

3. Using distributed hash table (DHT) to map peer to data

What are the characteristics of direct storage? [2]

1. Value is data

2. Good for small data but inflexible for big

What are the characteristics of indirect storage? [2]

1. Value is storage address (K/V-pair)

2. Good for big data but one more step

What are the characteristics of DHT? [2]

1. Hash(String) → 0,… ,2^m-1

2. Overlapping of responsibility

What are the search costs of hash tables? [2]

1. Search in O(1) with centralized hash table

2. Search in O(log n) with DHT and n peers storing O(log n) routing information

What is the idea of an efficient routing algorithm with DHT?

Data request to arbitrary peer with information about closer destination

What are the challenges of DHT? [4]

1. Make the network flexible reliable and scalable

2. Equal distribution of content

3. Routing strategies working with small routing tables

4. Assignment and reassignment of responsibilities

What are the interface functions of DHT? [2]

1. Update information with Put(key, value)

2. Request information with Get(key) → value

What are the comparisons of DNS and DHT? [2]

1. DNS maps node to IP address (Computers and services) and DHS maps key to value (Addresses documents and data)

2. DNS refers to hierarchical structure and domains and DHS does not need servers or name spaces

Which implementations of DHT exist? [8]

1. Chord 2. Pastry 3. CAN 4. Tapestry 5. P-Grid 6. Kademlia 7. Symphony and 8. Viceroy

What are the characteristics of chord? [11]

1. UC Berkeley and MIT

2. Identifier with 160 bits

3. Key is sha-1(data-item-descriptor)

4. ID is sha-1(IP addresses + port)

5. Ring topology with clockwise next node (so-called successor) // No proximity based routing

6. Routing table contents are links

7. Consistent hashing with mapping nodes into linear address

8. Fault tolerance

9. Adding node needs to update nodes in charge of key n-2^i in O(log n²)

10. Departing node needs to contact successor predecessor and nodes at finger distance and copy data in O(log n²)

11. Lookup latency ~ log2(n)/2

 

What are possible consistent hashings? [3]

1. Primitive 2. Advanced and 3. Chord finger tables

What the characteristics of primitive consistent hashing? [3]

1. Links to next successor

2. Search in O(n)

3. Node failure breaks system

What the characteristics of advanced consistent hashing? [2]

1. Links to z successors

2. For z=n fully meshed routing system as server

What the characteristics of chord finger tables consistent hashing? [2]

1. Links to log(N) successors

2. Node n has fingers pointing on n+2^i and corresponding successor

What are the characteristics of fault tolerance for cord? [4]

1. Responsibilitye for routing and data storage by application // layered design

2. Delete nodes after timeout // Soft state approach

3. Forward to previous finger if finger failure and replace entry with successor

4. Store successor list if successor failure

What are the characteristics of pastry? [5]

1. Microsoft Research and Rice University

2. Used in PAST Squirrel

3. Identifier with l=128 bits identified by base 2^b

4. Prefix based tree topology with closest node

5. Lookup latency in O(log n)

What are possible routing algorithms for pastry? [6]

1. Tree-based search

2. Prefix-based routing

3. Route towards numerically closest node in O(log 2bn)

4. Nodes know own prefix area and base-1 on each own level

5. Long range

6. Close range

What are the characteristics of the table in pastry? [5]

1. l/b rows // One per prefix bit

2. 2^b-1 columns // One per other bit at position I

3. Topologically closest node latest matching bit (row) and next bit of requested object (column)

4. Access in O(1) with possible empty node

5. O(log n) entries

What is the definition of the long range routing for pastry? [3]

1. Check leaf set

2. Node with longer prefix match

3. Node with same prefix match but closer

What is the definition of pastry close range routing? [2]

1. Check leaf set

2. Closest leaf node

What are the characteristics of fault tolerance for pastry? [4]

1. Store leaf set for routing and recovery with periodic verification

2. Store neighbor set for routing

3. Replace corrupt leaf set entry with alternative from Leaf(L) and Leaf(L/2)

4. Replace corrupt routing table entry with help from alternative node in same row or next row

What has to be considered if adding nodes in pastry? [5]

1. Copy neighbor set of topologically nearest node

2. Get row of each node from routing to Z

3. Copy leaf set of Z

4. Fill empty entries with visited nodes

5. Delete own entries

What are the characteristics of CAN? [4]

1. UC Berkeley and ICSI

2. Identifier in x-dimensional value space

3. Node manages one partition and content inside

4. Neighbors if D-1 edges (not corners) touch // Expected O(2D)

What are the characteristics of the partitioning tree in CAN? [5]

1. Reflects history of partitioning // Label with 0 and 1

2. New node gets half space of node area his inserted to and label 1 if top or right

 

3. Important in case of failures

4. Neighbor gets partition of removed node

5. TAKEOVER if smallest neighbor signal indicates failure

How can the partitions of a removed node be distributed in CAN? [2]

1. Merge according to tree

2. Add to smallest number of keys if 1. is not possible

How does routing in CAN works?

Give request to neighbor nearest to destination

What are possible improvements for CAN? [5]

1. More dimensions with shorter paths

2. More concurrent coordination systems (so-called realities) with better redundancy

3. Routing metrics with delay instead of distance

4. Overlap regions

5. Equal partition of regions

What are the complexity characteristics of CAN? [2]

1. Search effort in O(D/4 *N^(1/D))

2. Memory requirement in O(D)

What are the characteristics of internet indirection infrastructure (i3)? [3]

1. Wheeler with “Any problem in cs can be solved with another layer of indirection, though creating a new problem.”

2. Parallel to TCP or UDP create indirection layer on top of IP

3. Separates data (infrastructure) and control plane (host)

What is possible with internet indirection infrastructure (i3)? [3]

1. Mobile IP

2. Multicast

3. Anycast

What are the characteristics of mobile IP of i3? [2]

1. Communication is done via trigger

2. Mobility possible

What are the characteristics of multicast of i3? [2]

1. Each receiver sets trigger

2. Distribution trees with further triggers for scalability

What holds for the associate-data/service-with-ID paradigm? [4]

1. before was location-to-location paradigm

2. Every packet is associated with 256bit ID

3. Receivers subscribe for IDs

4. Similar to mailing lists or publish/subscribe systems

What are the characteristics of indirection? [4]

1. Implementation with DHTs

2. Best-effort-type service

3. Update triggers via soft-state

4. Reliability flow-/congestion-control by end system

What are the motivations for cloud computing? [2]

1. Growth of data volumes

2. All human spoken words 5EB

What are examples for data growth in the last ten years? [4]

2006 Wayback Machine with 2PB

2007 NOAA with 1PB climate data

2008 Google with processing 20PB per day

2008 CERN’s LHC with generating 15PB per year