Data Mgmt Chärtli
Set of flashcards Details

Flashcards | 81 |
---|---|
Language | English |
Category | Computer Science |
Level | University |
Created / Updated | 31.05.2023 / 31.05.2023 |
Weblink | https://card2brain.ch/box/20230531_datamgmt |
Common window functions in SQL include ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, FIRST_VALUE, LAST_VALUE, and SUM/AVG/MIN/MAX with an OVER clause.
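A minimal sketch of a few of these, run through Python's sqlite3 module (SQLite supports window functions from version 3.25; the sales table and its values are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, month TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('East', '2023-01', 100), ('East', '2023-02', 150),
        ('West', '2023-01', 200), ('West', '2023-02', 120);
""")

# Number rows per region, compare each month with the previous one,
# and attach the per-region total -- all without collapsing rows.
rows = conn.execute("""
    SELECT region, month, amount,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY month) AS rn,
           LAG(amount)  OVER (PARTITION BY region ORDER BY month) AS prev_amount,
           SUM(amount)  OVER (PARTITION BY region)                AS region_total
    FROM sales
""").fetchall()
for row in rows:
    print(row)
```

Unlike GROUP BY, the OVER clause keeps every input row and attaches the aggregate result to it.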
A stored procedure in SQL is a precompiled set of SQL statements that can be executed repeatedly by calling the procedure name. It allows you to encapsulate complex logic and business rules into a single unit that can be reused across multiple applications.
To create a stored procedure in SQL, you use the CREATE PROCEDURE statement followed by the name of the procedure and its parameter list (if any). For example, in T-SQL (SQL Server): `CREATE PROCEDURE my_proc @param1 INT, @param2 VARCHAR(50) AS BEGIN SELECT * FROM my_table WHERE col1 = @param1 AND col2 = @param2 END;` This creates a stored procedure called "my_proc" that takes two parameters (@param1 and @param2) and selects all rows from "my_table" where col1 matches @param1 and col2 matches @param2. It is then invoked by name, e.g. `EXEC my_proc 1, 'foo';`.
Some benefits of using stored procedures in SQL include improved performance, increased security, reduced network traffic, improved code organization and maintainability, and easier application development.
Normalization in database design is the process of organizing data into tables to reduce redundancy and improve data integrity. It involves breaking down larger tables into smaller ones based on their functional dependencies.
Common normalization forms in database design include first normal form (1NF), second normal form (2NF), third normal form (3NF), Boyce-Codd normal form (BCNF), fourth normal form (4NF), and fifth normal form (5NF). Each successive normal form builds on the previous one to further reduce redundancy and improve data integrity.
Some benefits of normalization in database design include improved data consistency, reduced data redundancy, easier maintenance and updates, and better scalability; because each fact is stored once, updates touch less duplicated data, though heavily normalized schemas may need more joins at read time.
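As a sketch of what the decomposition looks like in practice (table and column names invented for illustration): a flat orders table that repeats customer data on every row is split so each customer fact is stored once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: orders_flat(order_id, customer_name, customer_city, product, qty)
    -- repeats the customer's name and city on every order (redundancy).

    -- After: customer attributes depend only on customer_id and live in one place.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        product     TEXT NOT NULL,
        qty         INTEGER NOT NULL
    );
""")
```

Changing a customer's city now updates exactly one row instead of every order that customer ever placed.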
Common Table Expressions (CTEs) in SQL are temporary named result sets that can be defined within the execution scope of a SELECT, INSERT, UPDATE, or DELETE statement. They are derived from a simple query and can be used as an alternative to derived tables (subqueries), views, and inline user-defined functions.
The purpose of using CTEs in SQL is to simplify complex queries by breaking them down into smaller, more manageable parts. They can also handle hierarchical or tree-structured data via recursive CTEs (WITH RECURSIVE), as sketched below.
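A minimal recursive-CTE sketch through sqlite3 (SQLite supports WITH RECURSIVE); the employee hierarchy is invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Alice', NULL), (2, 'Bob', 1), (3, 'Carol', 1), (4, 'Dave', 2);
""")

# Anchor member: the root (no manager); recursive member: direct reports.
rows = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM employees e JOIN chain c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth
""").fetchall()
print(rows)  # e.g. [('Alice', 0), ('Bob', 1), ('Carol', 1), ('Dave', 2)]
```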
CTEs were introduced as part of the SQL:1999 standard and have long been supported by MS SQL Server, DB2, Firebird, and other databases; Oracle offered comparable hierarchical queries even earlier through its proprietary CONNECT BY syntax before adopting standard CTEs.
To define a CTE in SQL, you use the WITH keyword followed by the name of the CTE and its definition. For example: `WITH my_cte AS (SELECT col1, col2 FROM my_table WHERE col3 = 'value') SELECT * FROM my_cte WHERE col1 > 10;` This query defines a CTE called "my_cte" that selects columns "col1" and "col2" from "my_table" where col3 equals 'value'; the outer query then returns the CTE's rows where col1 is greater than 10.
A key-value store is a type of NoSQL database that stores data as a collection of key-value pairs. Each value can be any type of data, including strings, numbers, or even other key-value pairs.
In a distributed key-value store, data is partitioned across multiple nodes in the system. Each node is responsible for storing a portion of the total dataset. When a client requests data, it sends the request to one of the nodes, which retrieves the data and returns it to the client.
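A toy sketch of that routing step in Python: hash the key, pick the responsible node. The three node names are invented for illustration; real stores typically use consistent hashing so nodes can join and leave without remapping most keys.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]   # hypothetical nodes
store = {node: {} for node in NODES}     # each node's local data

def node_for(key: str) -> str:
    # Hash the key and map it onto one of the nodes.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return NODES[digest % len(NODES)]

def put(key, value):
    store[node_for(key)][key] = value

def get(key):
    return store[node_for(key)].get(key)

put("user:42", {"name": "Alice"})
print(node_for("user:42"), get("user:42"))
```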
Key-value stores are often used for caching frequently accessed data, storing session information for web applications, and managing user profiles or preferences.
A graph database is a type of NoSQL database that stores data as nodes and edges in a graph structure. Nodes represent entities such as people or objects, while edges represent relationships between them.
Graph databases use traversal-based queries to navigate through the graph structure and retrieve related nodes and edges. This allows for complex queries that can find patterns or connections between different entities.
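A minimal in-memory sketch of what such a traversal does (adjacency list and names invented; a real graph database indexes and optimizes this):

```python
from collections import deque

# Who is reachable from a start node by following "knows" edges?
edges = {
    "alice": ["bob", "carol"],
    "bob":   ["dave"],
    "carol": ["dave"],
    "dave":  [],
}

def reachable(start):
    # Breadth-first traversal over the edge list.
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start}

print(reachable("alice"))  # {'bob', 'carol', 'dave'}
```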
Graph databases excel at handling complex relationships between entities, making them well-suited for use cases such as social networks, recommendation engines, and fraud detection systems.
A traditional RDBMS stores data in tables with a fixed schema, while NoSQL databases store data in various formats, including documents, key-value pairs, column families, and graphs.
NoSQL databases can handle large amounts of unstructured or semi-structured data more efficiently than traditional RDBMS. They also offer greater scalability and flexibility.
Polyglot Persistence refers to the practice of using multiple types of databases to store different types of data within an application. This allows developers to choose the best database for each specific use case.
A document-based database stores data as documents, typically in JSON or BSON format. Each document can have its own unique structure and schema.
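A tiny illustration of that flexibility (documents invented; real document stores such as MongoDB add indexing and a query language on top):

```python
import json

# Two documents in one "collection" with different structures.
users = [
    {"_id": 1, "name": "Alice", "tags": ["admin"]},
    {"_id": 2, "name": "Bob", "address": {"city": "Bern"}},
]

# A minimal "query": find documents matching a field value.
matches = [doc for doc in users if doc.get("name") == "Bob"]
print(json.dumps(matches, indent=2))
```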
Sharding involves partitioning data across multiple servers in order to improve performance and scalability. Each server only holds a portion of the total dataset, allowing for faster queries and easier scaling.
Eventual consistency means that updates made to the database will eventually propagate to all nodes in the system, but there may be some delay or inconsistency during this process. This is often used as a trade-off for improved performance and availability.
A column-family database is a type of NoSQL database that stores data as columns rather than rows. Each column can contain multiple values, and each row can have a different set of columns.
In a column-family database, data is organized into column families, which are similar to tables in a traditional RDBMS. Each column family can have its own schema and indexing strategy.
Column-family databases are often used for storing large amounts of time-series or event-based data, such as log files or sensor readings.
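A toy sketch of the data model using plain dictionaries (names invented; systems like Cassandra or HBase implement this on disk with per-family storage and indexing):

```python
# row_key -> {column_family: {column: value}}
table = {}

def put(row_key, family, column, value):
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value

# Time-series usage: one row per sensor, one column per timestamp.
put("sensor-1", "readings", "2023-05-31T10:00", 21.5)
put("sensor-1", "readings", "2023-05-31T10:05", 21.7)
put("sensor-2", "readings", "2023-05-31T10:00", 19.2)
put("sensor-2", "meta", "location", "roof")  # rows need not share columns

print(table["sensor-1"]["readings"])
```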
The CAP theorem states that it is impossible for a distributed system to simultaneously provide all three of the following guarantees: consistency, availability, and partition tolerance.
Different types of databases make different trade-offs between consistency, availability, and partition tolerance. For example, traditional RDBMS prioritize consistency over availability and partition tolerance, while NoSQL databases often prioritize availability and partition tolerance over strong consistency.
Sharding can improve partition tolerance by allowing data to be distributed across multiple nodes in the system. However, it can also make it more difficult to maintain strong consistency across all nodes.
Eventual consistency is a property of distributed systems in which updates to data are propagated asynchronously and may take some time to be fully replicated across all nodes in the system. As a result, different nodes may have slightly different views of the data at any given time.
NoSQL databases often use techniques such as vector clocks or conflict resolution algorithms to reconcile conflicting updates and ensure that all nodes eventually converge on a consistent view of the data.
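A toy vector-clock sketch in Python (one of the techniques mentioned above; the node names are invented): each node keeps a counter per node, merging takes the element-wise maximum, and two clocks that are mutually incomparable signal a conflict.

```python
def increment(clock, node):
    # Node 'node' performs a local event: bump its own counter.
    clock = dict(clock)
    clock[node] = clock.get(node, 0) + 1
    return clock

def merge(a, b):
    # Element-wise maximum: the clock after reconciling two replicas.
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

def happened_before(a, b):
    # a causally precedes b iff a <= b component-wise and a != b.
    return a != b and all(a.get(n, 0) <= b.get(n, 0) for n in set(a) | set(b))

a = increment({}, "node1")    # {'node1': 1}
b = increment(a, "node2")     # b builds on a
c = increment(a, "node1")     # c also builds on a -- concurrent with b
print(happened_before(a, b))                         # True
print(happened_before(b, c), happened_before(c, b))  # False False -> conflict
print(merge(b, c))            # {'node1': 2, 'node2': 1}
```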
Eventual consistency can make it more difficult to reason about the state of the system at any given time, since different nodes may have different views of the data. It can also make it more difficult to enforce constraints or perform transactions that span multiple nodes.
Polyglot persistence refers to the practice of using multiple types of databases within a single application or system, each optimized for a specific type of data or workload.
Polyglot persistence allows developers to choose the best tool for each job, rather than trying to fit all data into a single database model. This can lead to better performance, scalability, and flexibility.
Polyglot persistence can add complexity to an application or system, since developers must manage multiple types of databases and ensure that they work together seamlessly. It can also make it more difficult to maintain consistency across different types of data.
A data warehouse is a large, centralized repository of data that is used for reporting and analysis. It typically contains historical data from multiple sources, organized in a way that makes it easy to query and analyze.
A data warehouse is optimized for read-heavy workloads and complex queries, while a transactional database is optimized for write-heavy workloads and simple queries. Data warehouses also typically contain denormalized or aggregated data, rather than raw transactional data.
Data warehouses are often used for business intelligence, reporting, and analytics. They can be used to answer questions such as "What were our sales by region last quarter?" or "Which products are most frequently purchased together?"
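A minimal star-schema sketch through sqlite3, answering the "sales by region" style of question; the fact and dimension tables and their contents are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_sales (store_id INTEGER, sale_date TEXT, amount REAL);
    INSERT INTO dim_store VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales VALUES
        (1, '2023-01-15', 100), (1, '2023-02-03', 250),
        (2, '2023-01-20', 300), (2, '2023-03-11', 50);
""")

# "What were our sales by region in Q1 2023?"
rows = conn.execute("""
    SELECT s.region, SUM(f.amount) AS total_sales
    FROM fact_sales f JOIN dim_store s USING (store_id)
    WHERE f.sale_date BETWEEN '2023-01-01' AND '2023-03-31'
    GROUP BY s.region
""").fetchall()
print(rows)  # [('East', 350.0), ('West', 350.0)]
```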
Data integration refers to the process of combining data from multiple sources into a single, unified view. This can involve tasks such as cleaning and transforming the data, resolving conflicts between different sources, and ensuring that the resulting dataset is consistent and accurate.
Data integration can be challenging due to differences in format, structure, and semantics between different sources of data. It can also be difficult to ensure that the resulting dataset is complete and accurate.
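A toy sketch of one such integration step (field names and records invented): two sources name the same attribute differently, so each record is mapped to a common schema before merging.

```python
# Source A and source B disagree on field names and formatting.
source_a = [{"cust_id": 1, "Name": "alice meier", "city": "Bern"}]
source_b = [{"id": 1, "full_name": "Alice Meier", "town": "Bern"}]

def from_a(rec):
    # Map source A's fields onto the common schema, normalizing case.
    return {"id": rec["cust_id"], "name": rec["Name"].title(), "city": rec["city"]}

def from_b(rec):
    return {"id": rec["id"], "name": rec["full_name"], "city": rec["town"]}

# Merge on id; here, later sources win on conflicting fields.
merged = {}
for rec in [from_a(r) for r in source_a] + [from_b(r) for r in source_b]:
    merged.setdefault(rec["id"], {}).update(rec)

print(list(merged.values()))  # [{'id': 1, 'name': 'Alice Meier', 'city': 'Bern'}]
```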