Common Technical Interview Questions and Answers, Updated on September 28, 2021

Exam Question 1

What was Hadoop named after?

A. Creator Doug Cutting’s favorite circus act
B. Cutting’s high school rock band
C. The toy elephant of Cutting’s son
D. A sound Cutting’s laptop made during Hadoop’s development
Correct Answer:
C. The toy elephant of Cutting’s son

Exam Question 2

All of the following accurately describe Hadoop, EXCEPT:

A. Open source
B. Real-time
C. Java-based
D. Distributed computing approach
Correct Answer:
B. Real-time

Exam Question 3

True or false? Due to Hadoop’s ability to manage unstructured and semistructured data and because of its scale-out support for handling ever-growing quantities of data, many experts view it as a replacement for the enterprise data warehouse.

A. True
B. False
Correct Answer:
B. False

Exam Question 4

Hadoop is a framework that works with a variety of related tools. Common cohorts include:

A. MapReduce, Hive and HBase
B. MapReduce, MySQL and Google Apps
C. MapReduce, Hummer and Iguana
D. MapReduce, Heron and Trumpet
Correct Answer:
A. MapReduce, Hive and HBase

Exam Question 5

True or false? Hadoop can be used to create distributed clusters, based on commodity servers, that provide low-cost processing and storage for unstructured data, log files and other forms of big data.

A. True
B. False
Correct Answer:
A. True

Exam Question 6

According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?

A. Big data management and data mining
B. Collecting and storing unstructured data
C. Management of Hadoop clusters
D. Data warehousing and business intelligence
Correct Answer:
D. Data warehousing and business intelligence

Exam Question 7

YARN Federation support, which enables a single cluster to scale to tens of thousands of nodes, was added to Hadoop in which update?

A. The 2.0 series
B. The 3.0 series
C. The 3.1 series
D. The 3.2 series
Correct Answer:
B. The 3.0 series

Exam Question 8

Hadoop benefits big data users for all of the following reasons EXCEPT:

A. It can store and process vast amounts of structured, semistructured and unstructured data, quickly
B. It can support real-time analytics to help drive better operational decision-making
C. It protects application and data processing against hardware failures
D. It requires data to be preprocessed for storage before filtering it for specific analytic uses
Correct Answer:
D. It requires data to be preprocessed for storage before filtering it for specific analytic uses

Exam Question 9

True or false? MapReduce can best be described as a programming model used to develop Hadoop-based applications that can process massive amounts of unstructured data.

A. True
B. False
Correct Answer:
A. True
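
To make the programming model concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases, using the classic word-count task. It is plain Python and does not use any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are illustrative assumptions, and a real Hadoop job would distribute these phases across cluster nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on a single machine."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data on commodity servers"]))
```

Because each map call sees only one line and each reduce call sees only one key, both steps can run in parallel on different machines, which is the property the model relies on.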

Exam Question 10

Which company consolidated rival Hadoop vendors in 2019?

A. Hortonworks
B. IBM
C. Amazon
D. Cloudera
Correct Answer:
D. Cloudera

Exam Question 11

What is still the most reliable method to ensure backups stay ransomware-free?

A. Disk-based backup
B. Tape backup
C. SaaS-based cloud backup
D. None of the above
Correct Answer:
B. Tape backup
Answer Description:
Tape storage for backups is the best way to keep ransomware from entering backup storage. Any storage medium that is connected to a network is at risk, and certain malware and ransomware variants are even designed to go after unstructured data and certain file types, such as JPGs, PDFs or Microsoft Office documents, that often contain mission-critical data. These variants can also go after restore point data and shadow copies. In essence, if the backup medium is plugged in somewhere, it’s at risk.

While tape storage has its own drawbacks, such as slower restore times compared to disk-based backups, air gapping mission-critical data remains the industry standard to ensure backups are not compromised. Organizations that use disk-based backup can implement a disk-to-disk-to-tape process to retain the benefits of disk, but also ensure data gets written onto tape.

Exam Question 12

Users who aren’t tech savvy don’t have the means to help protect an organization’s data or their backups should a ransomware attack hit.

A. True
B. False
Correct Answer:
B. False
Answer Description:
Historically, email has been the most common way that ransomware enters an organization’s infrastructure. Ransomware and crimeware-as-a-service groups often turn to phishing, crafting carefully designed emails that target certain individuals and organizations, and studying their social media presence, online profiles and business intel to make illegitimate emails look real. For example, a user could click on an infected email that attackers have disguised as an internal message from a co-worker, and that’s all it takes.

Organizations that want to protect their users and backup integrity should invest resources into training employees and stakeholders to spot and be mindful of suspicious emails, webpages and social media posts. Users who are wary of what’s on their screens and what they click on put an organization in a much better position to prevent ransomware code from entering its storage and backup environments.

Exam Question 13

Replication alone is one of the best methods to create ransomware-free backups.

A. True
B. False
Correct Answer:
B. False
Answer Description:
Replication on its own does “almost nothing” to protect against ransomware, according to TechTarget contributor Brien Posey. By itself, replication can be a useful tool to hedge against data loss should a virtual machine fail, but many replication engines can’t distinguish malicious files from genuine data. Admins could therefore copy malicious files along with the genuine ones, rendering those backups unusable.

As a best practice, use replication in conjunction with the creation of multiple recovery points. This enables IT admins to restore from a recovery point just before the ransomware attack hit, even if a replica VM is hit with ransomware, ensuring the restore data is ransomware-free. If an organization’s replication engine doesn’t support multiple recovery point creation, it may be time to explore other ransomware backup options.
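
The idea of restoring from a recovery point just before the attack can be sketched in a few lines. This is a simplified illustration, not a vendor API: the function name latest_clean_recovery_point is hypothetical, and it assumes the time the attack began is already known, which in practice may take investigation to establish.

```python
from datetime import datetime
from typing import List, Optional


def latest_clean_recovery_point(
    recovery_points: List[datetime], attack_time: datetime
) -> Optional[datetime]:
    """Return the newest recovery point taken strictly before the attack.

    Returns None when every recovery point is at or after the attack time,
    which is the position a replica with a single recovery point can leave
    an organization in.
    """
    candidates = [point for point in recovery_points if point < attack_time]
    return max(candidates) if candidates else None
```

With recovery points taken daily at midnight on September 1, 2 and 3 and an attack at noon on September 2, the function returns the September 2 midnight point; with only the September 3 point available, it returns None.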

Exam Question 14

Over what time span can ransomware encrypt files?

A. Over the course of the day of attack
B. Weeks at a time
C. Days at a time
D. All of the above
Correct Answer:
D. All of the above
Answer Description:
Not all ransomware strikes instantaneously. Some variants move clandestinely through an organization’s network for weeks, or even months, encrypting files well before the attack becomes visible. These attacks can be profoundly expensive to remediate: if admins don’t detect the intrusion for weeks, the ransomware has likely been backed up multiple times. That makes distinguishing clean backups from infected ones a daunting and time-consuming task.

As a best practice, test backups as often as possible and look for encrypted files or malicious code. Should ransomware hit, check backup catalogs for previously unencrypted files. Adopt a backup application that can identify previously unencrypted versions of files and use those to restore in conjunction with the revered 3-2-1 backup strategy to ensure clean copies of data lie elsewhere.
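
The 3-2-1 strategy is commonly stated as at least three copies of the data, on at least two different media types, with at least one copy offsite. A minimal sketch of that check follows; the BackupCopy record and the satisfies_3_2_1 function are hypothetical names for illustration, not part of any backup product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record, not a vendor schema)."""
    name: str
    media: str      # for example "disk", "tape" or "cloud"
    offsite: bool   # True if the copy is stored away from the primary site


def satisfies_3_2_1(copies: List[BackupCopy]) -> bool:
    """Check for >= 3 copies, on >= 2 media types, with >= 1 copy offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.media for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite
```

For instance, a production copy on disk, a local disk backup and an offsite tape copy pass the check, while three disk copies in one building do not.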

Exam Question 15

Backup vendors don’t address ransomware, so they won’t be helpful in protecting backups.

A. True
B. False
Correct Answer:
B. False
Answer Description:
Many backup vendors are doing quite the opposite. As ransomware attacks continue to increase, vendors have updated their products to address ransomware concerns and the infection of backups. They have added malware detection, two-factor authentication, endpoint protection, continuous data protection and other features.

Exam Question 16

Work-from-home environments can exacerbate the issue of ransomware and ransomware backup.

A. True
B. False
Correct Answer:
A. True
Answer Description:
Decentralized and edge data can be difficult to protect and the COVID-19 pandemic has caused an increase in attacks, particularly regarding endpoint security and edge data storage. Employees working remotely often store data on their personal hard drives, as opposed to the cloud or an enterprise file server. If ransomware enters their computer or other device, they could lose that data. This has led to an increase in attack surfaces for hackers.

It’s best practice to encourage employees to save files to a location where admins can back up that data. Admins can also use other remote backup tools to ensure user data isn’t lost if a device is hit with ransomware. Be sure to incorporate other remote security measures, such as strong patching, antimalware and limiting users’ access privileges to only job-essential files and applications.

Exam Question 17

What is immutability?

A. A type of retention policy
B. A term used to describe disk-to-disk backup
C. A write once, read many (WORM) storage technology
Correct Answer:
C. A write once, read many (WORM) storage technology
Answer Description:
Immutable storage is a technology, applicable to virtually all types of storage media, that prevents a file from being deleted or modified. That means attackers who enter an IT environment, whether from inside or outside the organization, cannot tamper with those files. Use immutable backups with other ransomware backup best practices since immutability can keep backups safe even if an attack hits. In addition, use retention policies to determine how long to store data.
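
As a rough illustration of WORM semantics combined with a retention policy, the sketch below models an in-memory store that accepts each key once, always allows reads, and refuses deletion until the retention period has passed. The WormStore class and its method names are assumptions made for this example; real immutable storage enforces these rules in the storage system itself rather than in application code.

```python
import time
from typing import Dict, Optional, Tuple


class WormStore:
    """In-memory write-once, read-many store with per-object retention."""

    def __init__(self) -> None:
        # key -> (data, retention expiry as a Unix timestamp)
        self._objects: Dict[str, Tuple[bytes, float]] = {}

    def write(self, key: str, data: bytes, retain_seconds: float,
              now: Optional[float] = None) -> None:
        """Store data once; a second write to the same key is rejected."""
        if key in self._objects:
            raise PermissionError(f"{key!r} is immutable and cannot be overwritten")
        current = time.time() if now is None else now
        self._objects[key] = (data, current + retain_seconds)

    def read(self, key: str) -> bytes:
        """Reads are always allowed."""
        return self._objects[key][0]

    def delete(self, key: str, now: Optional[float] = None) -> None:
        """Deletion is allowed only once the retention period has expired."""
        current = time.time() if now is None else now
        _, expires_at = self._objects[key]
        if current < expires_at:
            raise PermissionError(f"{key!r} is under retention and cannot be deleted")
        del self._objects[key]
```

The optional now argument exists only so the behavior can be exercised with fixed timestamps instead of the system clock.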

Exam Question 18

Cloud has become a popular option to protect backups from ransomware because:

A. The cloud is easy to scale
B. Cloud applications are often off office networks
C. Cloud vendors offer support for cloud backup networks
D. All of the above
Correct Answer:
D. All of the above
Answer Description:
Many organizations are turning to the cloud to keep their backups safe from ransomware. As a ransomware backup best practice, admins can use cloud in conjunction with other backup types to keep WORM copies, should ransomware infect other forms of backup. Cloud can also potentially provide faster recovery times.

Cloud is not completely immune to ransomware attacks, but if used properly, it can be a useful backup method. When using cloud to protect backups from ransomware, grant permissions only to users who truly need access, and keep as many endpoints as possible from accessing the cloud. This can keep hackers and ransomware from entering cloud storage through an end user’s computer or devices.

Exam Question 19

Which two cloud providers joined forces to create Gluon?

A. IBM and Microsoft
B. AWS and Google
C. Google and IBM
D. AWS and Microsoft
Correct Answer:
D. AWS and Microsoft
Answer Description:
In October 2017, AWS and Microsoft teamed up to create Gluon, an open source deep learning library. Its interface and automation capabilities aim to make it easier for developers to build machine learning models.

Exam Question 20

True or false: Machine learning and deep learning are the same thing.

A. True
B. False
Correct Answer:
B. False
Answer Description:
Although the two AI technologies are related, machine learning and deep learning have different applications. Machine learning enables software to learn and predict outcomes, such as how much wood a construction company might need to order in the spring, without additional programming. Deep learning, also known as deep neural networking, takes it a step further and looks deeper into data for trends and relationships. A deep learning service, for instance, could make recommendations on which movie you’d like to watch based on your viewing habits.
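
To show what learning to predict an outcome can look like at its simplest, the sketch below fits a straight line to past observations with ordinary least squares and uses it to predict a new value. It is only one of many machine learning techniques; the function names and the numbers in the usage note are made up for illustration and are not real construction data.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model: Tuple[float, float], x: float) -> float:
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept
```

Given hypothetical history such as projects_started = [2, 4, 6, 8] and wood_ordered = [21, 41, 61, 81], calling fit_line on those lists and then predict(model, 10) estimates the order for ten projects, with no rule about wood written by hand.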

Exam Question 21

Which suite of services from Microsoft offers APIs to embed image and language processing capabilities into an app?

A. Azure Smart Cloud
B. Microsoft Cognitive Services
C. Azure Artificial Intelligence
D. Microsoft AI Services
Correct Answer:
B. Microsoft Cognitive Services
Answer Description:
Microsoft Cognitive Services offers a number of machine learning technologies that enable a developer to incorporate capabilities such as image processing and speech-to-text into an application. The suite includes software development kits and over 20 APIs.

Exam Question 22

Which AWS managed service aims to help enterprises more easily and quickly integrate machine learning-based models into applications?

A. Amazon SageMaker
B. Amazon Fargate
C. Amazon MagicMan
D. Amazon Sorcerer
Correct Answer:
A. Amazon SageMaker
Answer Description:
At AWS re:Invent 2017, the company introduced Amazon SageMaker. The machine learning and AI service primarily targets developers and aids with the creation, training and management of machine learning models. It comes with 10 of the most common machine learning algorithms built in.

Exam Question 23

Which product is at the center of IBM’s machine learning and AI portfolio?

A. IBM Moriarty
B. IBM Holmes
C. IBM Sherlock
D. IBM Watson
Correct Answer:
D. IBM Watson
Answer Description:
IBM Watson is a supercomputer that is at the core of the vendor’s cognitive services. With IBM Watson and its APIs, enterprises can create chatbots, analyze data, turn speech into text, identify emotions through text and more.

Exam Question 24

Which AWS service infuses machine learning into cloud security?

A. Amazon Stacie
B. Amazon Macie
C. Amazon Incognito
D. Amazon Cognito
Correct Answer:
B. Amazon Macie
Answer Description:
Amazon Macie is an automation tool that uses machine learning to discover, classify and protect data in the cloud provider’s Simple Storage Service. It monitors data and sends an alert in the event of suspicious activity. Amazon Macie can also automatically take action against certain threats.

Exam Question 25

True or false: Azure Machine Learning Studio doesn’t require programming to build predictive analysis models.

A. True
B. False
Correct Answer:
A. True
Answer Description:
Azure Machine Learning Studio has an interactive, visual interface with drag-and-drop abilities to construct, test and deploy predictive analysis models — no programming required. Developers can choose from a library of machine learning algorithms and access a gallery that shows examples of real-world applications. Once developers build a model, they can publish it as a web service.

Exam Question 26

Which Google-developed service competes most directly with Gluon?

A. TensorFlow
B. Kaggle
C. Dataflow
D. Tenor
Correct Answer:
A. TensorFlow
Answer Description:
TensorFlow is an open source machine learning software library used to build computational graphs. Developed internally at Google, TensorFlow offers numerous models for object detection, voice recognition, translation and more.

Exam Question 27

Which of these advanced storage capabilities are worth looking into when building an unstructured data storage system at enterprise scale?

A. Automated storage tiering
B. Storage analytics
C. Capacity optimization
D. Smart data protection
E. All of the above
Correct Answer:
E. All of the above
Answer Description:
With advanced storage tiering capabilities, storage systems can learn the performance needs of specific workloads and implement key policies for quality of service, security and cost control. Storage tiering is a key capability for a unified storage infrastructure and many storage management products.

Analytics is a good source of information about storage across the enterprise. It can provide workload-level reporting on storage use, and what-if scenarios that make it easier to plan changes to storage or virtualization. Such information is central to cost control when dealing with unstructured data.

Capacity optimization and copy data management (CDM) are key to limiting the amount of data an enterprise holds on to, reducing costs and management challenges. Data deduplication, compression and thin provisioning are all used to optimize capacity at the array level.
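
One way to picture array-level capacity optimization is content-hash deduplication: identical blocks are stored once and referenced many times. The sketch below is a simplified illustration using fixed-size blocks and SHA-256 digests; the function names and the 4,096-byte block size are assumptions, and production arrays use their own, more elaborate schemes.

```python
import hashlib
from typing import Dict, List, Tuple


def deduplicate(data: bytes, block_size: int = 4096) -> Tuple[Dict[str, bytes], List[str]]:
    """Split data into fixed-size blocks and store each distinct block once.

    Returns (block_store, recipe): block_store maps a SHA-256 digest to the
    block contents, and recipe lists the digests in their original order.
    """
    block_store: Dict[str, bytes] = {}
    recipe: List[str] = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        block_store.setdefault(digest, block)  # keep only the first copy
        recipe.append(digest)
    return block_store, recipe


def restore(block_store: Dict[str, bytes], recipe: List[str]) -> bytes:
    """Rebuild the original data by following the recipe."""
    return b"".join(block_store[digest] for digest in recipe)
```

Data made of three identical blocks followed by one different block yields a four-entry recipe but only two stored blocks, and restore returns the original bytes.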

Smart data protection products provide scalable, capacity-optimized backup storage. They take advantage of deep metadata and CDM techniques to provide instant data cloning and global recovery.

Exam Question 28

True or false: Flash storage can’t be used for unstructured data storage.

A. True
B. False
Correct Answer:
B. False
Answer Description:
As the cost of flash has come down, it has become a reasonable alternative to hard disk drives (HDDs). For some workloads, it makes sense to use flash to enhance the performance of speed-challenged object storage. Flash-based drives also use less energy and space.

Exam Question 29

When using flash to improve unstructured data storage, which of the following is not the best advice?

A. Don’t worry about your budget because the cost of flash is coming down
B. Know your requirements and options
C. Use capacity management techniques
D. Consider using hybrid or tiered systems
Correct Answer:
A. Don’t worry about your budget because the cost of flash is coming down
Answer Description:
Before opting for flash, review your business requirements, including application performance, availability, capacity and data accessibility. Use this information, along with your budget, to assess the available flash-based SSDs and all-flash and hybrid arrays.

Built-in capacity management is your best bet when considering flash for unstructured data. Look for features such as compression, deduplication and automated tiering. Hybrid systems that use flash for hot data storage and HDDs for cold data may be cost-effective. Systems that use tiers for faster and slower flash and cloud storage can also be good bets.
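
The hot and cold split in a hybrid system can be sketched as a simple placement rule based on how recently data was accessed. This is a toy policy under stated assumptions: the choose_tier function is hypothetical, and the 30-day threshold is an arbitrary illustrative value, not a recommendation. Real automated tiering typically weighs more signals, such as access frequency and workload requirements.

```python
from datetime import datetime, timedelta


def choose_tier(last_accessed: datetime, now: datetime, hot_days: int = 30) -> str:
    """Place recently accessed (hot) data on flash and the rest on HDD.

    hot_days is an illustrative threshold, not a recommended setting.
    """
    age = now - last_accessed
    if age <= timedelta(days=hot_days):
        return "flash"  # hot data: served from the faster, costlier tier
    return "hdd"        # cold data: kept on the cheaper, slower tier
```

With the default threshold, a file last read three days ago lands on flash, while one untouched for 90 days lands on HDD.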

Exam Question 30

Which is a major challenge involved in managing unstructured data storage?

A. The volume of unstructured data continues to grow at a rapid rate
B. New types of unstructured data regularly appear with new security risks and management headaches
C. Gaining insight and value from stored unstructured data requires new tools and skill sets
D. All of the above
Correct Answer:
D. All of the above
Answer Description:
Businesses are generating massive amounts of unstructured data, and the volume continues to grow at unchecked rates, creating huge challenges. Technologies such as IoT, AI and machine learning are producing new kinds of unstructured data, with unique management and security issues. Unstructured data by itself doesn’t provide value or insight to an organization. Management and analytics tools and data analytics experts are needed to gain value from stored data.