Big Data Flashcards

1
Q

Redshift characteristics

A
  • It’s for BI applications
  • It’s relational (but not a replacement for RDS in traditional applications)
  • It can store up to 16 PB of data
2
Q

EMR (Elastic MapReduce)

A

A managed big data platform that lets you process vast amounts of data using open-source tools such as Spark, HBase, Flink, Hudi, and Presto. It’s AWS’s ETL tool.

3
Q

How does EMR (Elastic MapReduce) work under the hood?

A

EMR is a managed fleet of EC2 instances running open-source tools (Spark, HBase, Flink, Hudi, and Presto).

4
Q

Do EC2 rules apply to EMR (Elastic MapReduce)?

A

Yes. You can use RIs and Spot instances to reduce your costs.
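
A minimal sketch of how Spot capacity might be requested for an EMR cluster, assuming the cluster is launched through boto3's `run_job_flow` with an `InstanceGroups` list. The helper name `build_instance_groups`, the group names, and the instance type are hypothetical; Reserved Instance pricing is applied at billing time, so it does not appear in the configuration.

```python
def build_instance_groups(instance_type, core_count, spot_task_count):
    """Build an InstanceGroups list for EMR's RunJobFlow request.

    The primary and core nodes stay On-Demand; optional task nodes use Spot.
    """
    groups = [
        {
            "Name": "primary",            # hypothetical label
            "InstanceRole": "MASTER",
            "Market": "ON_DEMAND",
            "InstanceType": instance_type,
            "InstanceCount": 1,
        },
        {
            "Name": "core",               # hypothetical label
            "InstanceRole": "CORE",
            "Market": "ON_DEMAND",
            "InstanceType": instance_type,
            "InstanceCount": core_count,
        },
    ]
    if spot_task_count > 0:
        groups.append(
            {
                "Name": "task-spot",      # hypothetical label
                "InstanceRole": "TASK",
                "Market": "SPOT",         # Spot pricing for interruptible workers
                "InstanceType": instance_type,
                "InstanceCount": spot_task_count,
            }
        )
    return groups


# Assumed instance type, chosen only for illustration.
groups = build_instance_groups("m5.xlarge", core_count=2, spot_task_count=4)
spot_nodes = sum(g["InstanceCount"] for g in groups if g["Market"] == "SPOT")
```

The list would be passed as `Instances={"InstanceGroups": groups}` in the `run_job_flow` call. Keeping Spot on task nodes only is one common choice, since losing a task node does not lose HDFS data.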

5
Q

Does EMR (Elastic MapReduce) live inside a VPC?

A

Yes

6
Q

Does Redshift support Multi-AZ deployments?

A

Not by default. A Redshift cluster runs in a single AZ unless you enable Multi-AZ, which is available only on RA3 node types. You can also create multiple clusters in different AZs, but they’re technically separate deployments.

7
Q

What’s the only service with a real-time response?

A

Kinesis

8
Q

AWS queuing services

A

SQS and Kinesis can both be queues. Each service has its pros and cons. SQS is easier and simpler, and Kinesis is faster and can store data for up to a year.
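
To make the "both can be queues" point concrete, here is a rough sketch of the request shapes for sending the same event to each service, assuming boto3's `send_message` (SQS) and `put_record` (Kinesis Data Streams). The helper names, queue URL, stream name, and event are all hypothetical.

```python
import json


def build_sqs_message(queue_url, event):
    """Keyword arguments for SQS SendMessage: the body is a string."""
    return {"QueueUrl": queue_url, "MessageBody": json.dumps(event)}


def build_kinesis_record(stream_name, event, partition_key):
    """Keyword arguments for Kinesis PutRecord: the data is bytes,
    and a partition key decides which shard receives the record."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": partition_key,
    }


event = {"order_id": 42, "status": "paid"}  # made-up event

sqs_request = build_sqs_message(
    "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue",  # placeholder URL
    event,
)
kinesis_request = build_kinesis_record(
    "example-stream",  # placeholder stream name
    event,
    partition_key=str(event["order_id"]),
)
```

These dictionaries would be unpacked into `sqs_client.send_message(**sqs_request)` and `kinesis_client.put_record(**kinesis_request)`. The extra partition key is part of why SQS is the simpler of the two.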

9
Q

Serverless SQL

A

Athena

10
Q

How to query data in S3?

A

Athena
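
As an illustration, a query against data in S3 could be submitted roughly like this, assuming boto3's Athena client and its `start_query_execution` call. The helper names, database, table, and bucket are hypothetical, and the table is assumed to already exist in the catalog.

```python
def build_athena_request(sql, database, output_s3_uri):
    """Build keyword arguments for Athena's StartQueryExecution call."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output location must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes the query results back to S3 at this location.
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


def run_query(sql, database, output_s3_uri):
    """Submit the query and return its execution ID (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("athena")
    response = client.start_query_execution(
        **build_athena_request(sql, database, output_s3_uri)
    )
    return response["QueryExecutionId"]


# Hypothetical database, table, and results bucket.
request = build_athena_request(
    "SELECT status, COUNT(*) FROM access_logs GROUP BY status",
    database="weblogs",
    output_s3_uri="s3://example-athena-results/queries/",
)
```

There is no cluster to provision here, which is the "serverless" part: the query is submitted, and the results land in the S3 output location.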

11
Q

Serverless ETL

A

Glue. Paired with Athena, it can help create the schema for your data.

12
Q

Big Data/BI dashboard/data visualisation?

A

QuickSight

13
Q

Elasticsearch

A

Excels when it’s combined with Logstash and Kibana. This creates an ELK stack and is a very common way to search over your server logs.
