Set of flashcards FC (Page 1 of 2)

Flashcards	72
Language	English
Category	Computer Science
Level	University
Created / Updated	28.11.2020 / 17.07.2021
Weblink	https://card2brain.ch/box/20201128_dbt

Cloud Computing Characteristics (NIST)

On-Demand Self-Service
Broad Network Access
Resource Pooling
Rapid Elasticity
Measured Service

Why is cloud computing not enough?

Requires continuous connectivity
Too high latency
Bandwidth limitations
Regulations / privacy requirements

What is the edge?

Outskirt of an administrative domain

What is Fog Computing?

What does it provide?

Extension of the cloud model
- applications can reside on multiple layers of a network's topology
Combining cloud resources with edge devices and potential intermediary nodes in the network

Provides the ability to analyze data near the edge for
- improving efficiency or
- to operate while disconnected from a larger network
Cloud services can be used for tasks that require more resources or elasticity

Fog Computing Characteristics

Runs required computations near the end-user
Uses lower latency storage at or near the edge
Uses low latency communication
Implements elements of management
Uses Cloud for strategic tasks
Multi-tenancy on a massive scale is required for some use cases
Geo Distributed
- Physical location is significant
- A dynamic pool of sites => unreliable connections between sites
- Sites may be resource-constrained

In which areas does Fog Computing benefit?

Data Collection, Analytics & Privacy
Security
- Moving security closer to the edge => higher performance security applications
Compliance Requirements
- Geofencing, data sovereignty, copyright enforcement
Real-Time

Challenges to Adoption of Fog Computing (inherent)

General
- They result from the very idea of using fog resources
- Technical constraints
  - limits of computational power
- Logical constraints
  - tradeoffs in distributed systems
- Market constraints
  - there are currently no managed edge services
No Edge Services
Lack of Standardized Hardware
Management Effort
Managing QoS
- IoT or autonomous cars have stronger quality requirements
- More problems (network latency/partitioning, message loss/reordering) in non-centralized systems
No Network Transparency

Challenges to Adoption of Fog Computing (external)

General
- They result from external entities
- Government agencies
- Attackers
Physical Security
- E.g., attaching hardware on top of a street light pole instead of at eye level
- Protection against fire and vandalism
Legal and Regulatory Requirements
- Data needs to be held in a certain physical location (eHealth)
- Liquid Fog-based applications might have trouble fulfilling certain aspects of privacy regulations

Synchronous Communication

Example: phone call, method call in Java
Requires both parties to be on-line
The caller must wait, and both server and client need to be alive
Disadvantages
- Higher probability of failures
- Difficult to identify and react to failures
- The one-to-one system is not practical for complex interactions
Finding out when the failure took place is not easy

Asynchronous Communication

Clients can do other things when they are waiting
Examples: Email, JavaScript callbacks
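
A minimal sketch of the asynchronous style, assuming Python's asyncio as the runtime. The names slow_service, client and run_client are hypothetical, and the sleep only stands in for a remote call. The client starts the request, does other work, and waits only when it needs the reply.

```python
import asyncio

async def slow_service(request):
    # Hypothetical remote call; the sleep stands in for network and processing delay.
    await asyncio.sleep(0.05)
    return f"reply to {request}"

async def client():
    log = []
    # Asynchronous style: start the request, then keep working.
    pending = asyncio.create_task(slow_service("ping"))
    log.append("client does other work")
    # Only now does the client wait for the reply.
    log.append(await pending)
    return log

def run_client():
    # Convenience wrapper so the sketch can be called from synchronous code.
    return asyncio.run(client())
```

A synchronous caller would instead block on slow_service before doing anything else.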

Types of decoupling

Space (Location)
Time
Technology
Data Format

Messaging patterns

Request / Response (1 to 1)
Load Balancing (1 to many)
Fan-out / Fan-in (1 to many / many to 1)
Broadcasting (many to many)
Pub/Sub (many to many, but structured)

What is Pub/Sub Messaging?

Clients can act as Publisher or Subscriber or both
Communication is many-to-many

Pub/Sub: Matching of Events and Subscriptions

Channel-based (low level of expressiveness)
Topic-based
Content-based (high level of expressiveness)

Pub/Sub: Broker vs. P2P

The broker handles client communication centrally
P2P clients have to route messages themselves
Broker-based setups are a good fit for fog
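
The broker-based variant can be sketched as a small in-memory class. This is an illustration only: the class name Broker and its methods are hypothetical, matching is by exact topic name, and there is no networking. It shows that publishers and subscribers only know the broker, not each other.

```python
from collections import defaultdict

class Broker:
    """Hypothetical in-memory broker: routes messages by exact topic name."""

    def __init__(self):
        # topic -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver to every subscriber of the topic; return how many were reached.
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

In a P2P setup there is no such central object, so each client would need its own routing logic.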

MQTT Pub/Sub Protocol

Lightweight and designed for devices that run in constrained environments
Topic-based
Broker-based
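
Topic-based matching in MQTT uses hierarchical topics with two wildcards: + for exactly one level and # for all remaining levels. The sketch below is a simplified matcher, not a broker implementation. The function name topic_matches is hypothetical, and special handling of topics that start with $ is left out.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT-style topic filter matches a concrete topic."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level;
            # it matches the parent level and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            # "+" matches any single level; otherwise levels must be equal.
            return False
    return len(filter_levels) == len(topic_levels)
```

For example, the filter sensors/+/temp matches sensors/room1/temp but not sensors/room1/humidity.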

Inter-Broker Routing Strategies

+ basic description

Event Flooding or Subscription Flooding
- Events/Subscriptions broadcasted to all brokers
- Minimizes end-to-end latency
- A lot of excess data
Gossiping
- Messages are distributed based on a probability distribution
- High tolerance for very dynamic environments
- Messages might not arrive at all or with high delay
Selective - Filtering
- Good if not all brokers are interconnected
- Subscription information is exchanged with neighbors
- Events are only forwarded to brokers that lie on a path to a subscription
Selective - Rendezvous Points (RP)
- RPs are the meeting points for events and subscriptions
- Must be close to clients -> otherwise high end-to-end latency

Case Studies / Broadcast Groups

Combining flooding and rendezvous points
- Global flooding
  - Broadcast messages to all brokers
  - Communication latency is optimal, but a lot of excess data
- Rendezvous point in the cloud
  - Fog brokers forward events to a central cloud broker
    - => cloud decides which other fog brokers need events
  - Minimizes excess data, but increases latency
- Tradeoff between latency and excess data dissemination
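
The tradeoff can be made concrete with a rough cost model for a single event. This is a sketch under simplifying assumptions: every inter-broker transfer counts as one message and one hop, and the function names are hypothetical.

```python
def flooding_cost(brokers, interested, origin):
    # Global flooding: the origin broker sends the event to every other broker.
    receivers = [b for b in brokers if b != origin]
    excess = [b for b in receivers if b not in interested]
    return {"messages": len(receivers), "excess": len(excess), "hops": 1}

def cloud_rp_cost(brokers, interested, origin):
    # Cloud RP: one message to the cloud, which forwards only to brokers
    # that have matching subscribers.
    targets = [b for b in brokers if b != origin and b in interested]
    return {"messages": 1 + len(targets), "excess": 0, "hops": 2}
```

With four brokers and one interested broker, flooding sends three messages in one hop (two of them excess), while the cloud RP sends two messages over two hops.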

Case Studies / Broadcast Groups / Broadcast group formation

Initially, each broker takes the role of a leader
Leaders subscribe to a dedicated topic at the cloud RP to detect other leaders
Leaders measure latency to other leaders
- If below a given latency threshold => merge
- Merge: determine new group leader (e.g. based on compute resources)
- Migrate members to new leader
If latency to a leader is above given latency threshold, leave group
Latency threshold controls group size
Can be used to manage the latency vs. excess data tradeoff
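
The merge step can be sketched as follows. This is a simplified illustration, not the original algorithm: it only covers merging leaders (not leaving a group), the new leader is picked by a hypothetical compute score, and latencies are given as a precomputed table.

```python
def form_groups(resources, latency, threshold):
    """Merge leaders whose mutual latency is below the threshold.

    resources: broker name -> compute score (assumed metric)
    latency:   frozenset({a, b}) -> measured latency between leaders a and b
    Returns a dict mapping each remaining leader to its group members.
    """
    # Initially, each broker is the leader of its own group.
    groups = {b: {b} for b in resources}
    merged = True
    while merged:
        merged = False
        leaders = sorted(groups)
        for i, a in enumerate(leaders):
            for b in leaders[i + 1:]:
                if latency[frozenset((a, b))] < threshold:
                    # The leader with more compute resources wins the merge.
                    winner, loser = (a, b) if resources[a] >= resources[b] else (b, a)
                    # Migrate all members of the losing group to the new leader.
                    groups[winner] |= groups.pop(loser)
                    merged = True
                    break
            if merged:
                break
    return groups
```

A higher threshold produces fewer, larger groups; a lower threshold keeps groups small.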

Case Studies / Vehicular Fog Computing

Vehicles
- Collect data
- Use it for vehicle-level decisions
- Transmit data to closest fog nodes
  - Asynchronous Request/Reply or Fan-Out
Fog nodes
- Process data of multiple cars for area-level decisions
- Send instructions to traffic lights
  - Synchronous Fan-In / Fan-Out
- Send aggregated status reports to cloud
  - Synchronous Fan-In
Traffic lights
- Operate as defined by instructions
Cloud
- Processes data from fog nodes for city-level decisions
- There might be an internal load balancer
- Publish traffic information to subscribed vehicles
  - Pub/Sub

Case Studies / DisGB

IoT data distribution is often non-uniform
It depends on where events are relevant / where relevant events can come from
Can be expressed with geo-context
Idea: use geo-contexts to identify RPs (two strategies)
- The event geofence can be used to identify RPs that are close to the subscribers of an event
  - The RPs for an event are the brokers closest to each subscriber that has created a matching subscription
    - Subscriptions are not distributed
    - Similar to flooding events, events are distributed
- The subscription geofence can be used to identify RPs that are close to the publishers of events
  - The RP for an event is the broker closest to the publisher of that event
    - Events are not distributed
    - Similar to flooding subscriptions, subscriptions are distributed
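
Both strategies reduce to finding the closest broker for a location. A minimal sketch, assuming brokers and clients have planar coordinates and that closeness is plain Euclidean distance; the function names are hypothetical and geofence matching itself is not modeled.

```python
import math

def closest_broker(brokers, location):
    # brokers: broker name -> (x, y) position (assumed planar coordinates)
    return min(brokers, key=lambda name: math.dist(brokers[name], location))

def rps_near_subscribers(brokers, subscriber_locations):
    # Strategy 1: events travel to the brokers closest to matching subscribers.
    return {closest_broker(brokers, loc) for loc in subscriber_locations}

def rp_near_publisher(brokers, publisher_location):
    # Strategy 2: the event stays at the broker closest to its publisher.
    return closest_broker(brokers, publisher_location)
```

With strategy 1 an event may have several RPs, with strategy 2 exactly one.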

What is replication?

Is a common strategy in data management and in distributed systems
Main idea
- maintain multiple copies of an entity (called replicas)
- on multiple servers
- for better availability
- and performance
Keeping replicas consistent is costly

Why do we need replication?

System availability / Fault-tolerance
- Failure resilience is critical in any enterprise system
- Keeping several copies of the server -> single failures should not affect the overall availability
- Redundancy allows switch over in case of failures
Replicas can protect against corrupted data (voting)
Performance / Scalability
- Large workloads can be spread and balanced across distributed replicas
- Local access is fast, remote access is slow
  - Keep copies in clients’ proximity
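
Voting against corrupted data can be sketched as a majority read over all replicas. This is an illustration with a hypothetical function name; it assumes values are directly comparable and that a strict majority of replicas is correct.

```python
from collections import Counter

def majority_read(replica_values):
    """Return the value held by a strict majority of replicas."""
    value, votes = Counter(replica_values).most_common(1)[0]
    if votes * 2 <= len(replica_values):
        # No value is held by more than half of the replicas.
        raise ValueError("no majority among replicas")
    return value
```

With three replicas, a single corrupted copy is outvoted by the two intact ones.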

What are three different replication scenarios?

Replicating server on a common resource may help availability if there is a replicated cache coherence mechanism
To get improvement in availability the resources must be replicated too
Replicated servers and resource replicas are not necessarily tightly coupled

What is replica consistency?

A read on any replica returns the result of the latest write to the logical data store
Consistency is expensive, hence different consistency models exist

What do CAP and PACELC mean?

CAP
- Consistency
- Availability
- Partition tolerance
PACELC
- If Partition: Availability vs. Consistency (as in CAP); Else: Latency vs. Consistency (when the system is running normally / absence of partitions)

Characterizing consistency?

Staleness
- How much is a given replica lagging behind?
Ordering
- How much does the operation serialization order deviate among replicas?

Data-centric consistency models

Sequential Consistency
- All replicas execute all updates in the same order
Causal consistency
- All replicas execute causally related operations in the same order; concurrent requests are executed in arbitrary order
Eventual Consistency
- In the absence of updates and failures, all replicas converge towards the same state

Client-centric consistency models

Monotonic Reads
- A read will never return older values than previously returned to the same client
Read Your Writes
- A read will never return older values than previously written by the same client
Write Follows Reads
- A client that reads version X and then updates the same data item will only update replicas that have at least version X
Monotonic Writes
- Two updates of the same client will always be serialized according to the chronological order of their submission
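
One way to picture these session guarantees is a client-side session that remembers the highest version it has seen. This is a sketch under assumptions, not a real system: Replica and Session are hypothetical classes, there is a single data item, and versions are simple counters.

```python
class Replica:
    """One copy of a single data item, tagged with a version counter."""

    def __init__(self, version, value):
        self.version = version
        self.value = value

class Session:
    """Tracks what one client has read or written so far."""

    def __init__(self):
        self.min_version = 0

    def read(self, replica):
        # Monotonic Reads / Read Your Writes: refuse replicas that lag
        # behind what this client has already seen or written.
        if replica.version < self.min_version:
            raise RuntimeError("replica is too stale for this session")
        self.min_version = replica.version
        return replica.value

    def write(self, replica, value):
        # Write Follows Reads: only update replicas that have at least
        # the version this client has read.
        if replica.version < self.min_version:
            raise RuntimeError("replica is too stale for this session")
        replica.version += 1
        replica.value = value
        self.min_version = replica.version
```

A real client would typically retry on another replica or wait instead of raising an error.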

What are the two parameters when designing a replication strategy?

When updates are propagated
Where updates are propagated

Replication strategies / When: What are the two options?

And how do they work?

Synchronous (eager)
- Propagates changes to the data immediately to all existing copies (before the commit)
- The ACID properties can apply to all replica updates
- Data copies are consistent at all times and at all sites
- On update: consult all other sites; the data is updated only if the sites reach an agreement
  - However, the system is unavailable for updates if only a single replica cannot be reached
Asynchronous (lazy)
- Updates are first executed and committed on the local copy, then the changes are propagated
  - During propagation, copies are inconsistent
- The update is eventually propagated to all sites (push/pull) and assuming no conflicts arise, the data eventually becomes consistent
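
The difference between the two options can be sketched with in-memory sites. This is a simplified illustration: Site and the three functions are hypothetical, reachability is a plain flag, and no real commit protocol or conflict handling is modeled.

```python
class Site:
    """Hypothetical replica site holding a key-value copy of the data."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.data = {}

def eager_write(sites, key, value):
    # Synchronous: all copies are updated together, or none at all.
    if not all(site.reachable for site in sites):
        raise RuntimeError("update rejected: a replica is unreachable")
    for site in sites:
        site.data[key] = value

def lazy_write(local, others, key, value):
    # Asynchronous: commit locally first, return the pending propagations.
    local.data[key] = value
    return [(site, key, value) for site in others]

def propagate(pending):
    # Push pending updates; keep those whose target is still unreachable.
    remaining = []
    for site, key, value in pending:
        if site.reachable:
            site.data[key] = value
        else:
            remaining.append((site, key, value))
    return remaining
```

With one unreachable site, eager_write fails while lazy_write succeeds locally and leaves the copies inconsistent until propagate completes.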

Replication strategies / Where: What are the two options?

And how do they work?

Primary copy (master)
- Updates can originate at only one copy; all other copies (secondaries) are updated to reflect the changes made at the master
- Secondary copies are read-only
Updates everywhere (group)
- Changes can be initiated at any of the copies

Advantages and Disadvantages of synchronous replication?

Advantages
- No inconsistencies (identical copies)
- Reading the local copy yields the most up-to-date value
- Changes are atomic
Disadvantages
- An operation has to update all sites
  - Longer execution time
  - Worse response time
  - Poor availability

Advantages and Disadvantages of asynchronous replication?

Advantages
- An operation is always local
  - Good response time
  - High availability
Disadvantages
- Data inconsistencies
- A local read does not always return the most up-to-date value
- Changes to all copies are not guaranteed
- Replication is not transparent

Advantages and Disadvantages of update everywhere replication?

Advantages
- Any site can run an operation
- Load is evenly distributed
Disadvantages
- Copies must be synchronized
- Concurrent updates will cause conflicts

Advantages and Disadvantages of primary copy replication?

Advantages
- No inter-site synchronization
- There is always one site that has all the updates
Disadvantages
- The load at the primary copy can be quite large
- Reading the local copy may not yield the most up-to-date value

Synchronous + Primary copy

Advantages/Disadvantages?

Practical?

Advantages
- Updates do not need to be coordinated
- No inconsistencies
Disadvantages
- Longest response time
- Only useful with few updates
- Local copies are read-only
- Low availability
Ideal: Globally correct, Remote writes
Practical: Too expensive (usefulness)

Asynchronous + Primary copy

Advantages/Disadvantages?

Practical?

Advantages
- No coordination necessary
- Short response times
Disadvantages
- Local copies are not up-to-date
- Inconsistencies
- Low write availability
Ideal: Inconsistent reads
Practical: Feasible (limited scalability)

Synchronous + Update everywhere

Advantages/Disadvantages?

Practical?

Advantages
- No inconsistencies
- Elegant symmetric solution
Disadvantages
- Long response times
- Updates need to be coordinated
- Low availability
Ideal: Globally correct, Local writes
Practical: Too expensive (does not scale)

Asynchronous + Update everywhere

Advantages/Disadvantages?

Practical?

Advantages
- No centralized coordination
- Shortest response times
- High availability
Disadvantages
- Inconsistencies and conflicts
- Updates can be lost (reconciliation)
Ideal: Inconsistent reads, Reconciliation
Practical: Feasible in many applications
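
How an update can be lost during reconciliation is easiest to see with a last-writer-wins rule. This is only one possible reconciliation policy, chosen here for illustration; the function name and the (timestamp, site_id, value) tuple layout are assumptions.

```python
def reconcile_lww(versions):
    """Pick the surviving version among concurrent updates.

    versions: list of (timestamp, site_id, value) tuples.
    The highest timestamp wins; the site id breaks ties.
    """
    return max(versions, key=lambda v: (v[0], v[1]))
```

If site a writes x=1 at time 10 and site b writes x=2 at time 12, reconciliation keeps x=2 and the update from site a is silently discarded.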
