210 Exam Questions for MLS-C01 Updated Versions With Test Engine [Q56-Q74]

Pass MLS-C01 Exam with Updated MLS-C01 Exam Dumps PDF 2023

NEW QUESTION 56
A Data Scientist is building a model to predict customer churn using a dataset of 100 continuous numerical features. The Marketing team has not provided any insight about which features are relevant for churn prediction. The Marketing team wants to interpret the model and see the direct impact of relevant features on the model outcome. While training a logistic regression model, the Data Scientist observes that there is a wide gap between the training and validation set accuracy.
Which methods can the Data Scientist use to improve the model performance and satisfy the Marketing team's needs? (Choose two.)

  • A. Perform t-distributed stochastic neighbor embedding (t-SNE)
  • B. Perform recursive feature elimination
  • C. Perform linear discriminant analysis
  • D. Add L1 regularization to the classifier
  • E. Add features to the dataset

Answer: B,D

 

NEW QUESTION 57
A Data Scientist needs to migrate an existing on-premises ETL process to the cloud. The current process runs at regular time intervals and uses PySpark to combine and format multiple large data sources into a single consolidated output for downstream processing.
The Data Scientist has been given the following requirements for the cloud solution:
- Combine multiple data sources.
- Reuse existing PySpark logic.
- Run the solution on the existing schedule.
- Minimize the number of servers that will need to be managed.
Which architecture should the Data Scientist use to build this solution?

  • A. Write the raw data to Amazon S3. Schedule an AWS Lambda function to run on the existing schedule and process the input data from Amazon S3. Write the Lambda logic in Python and implement the existing PySpark logic to perform the ETL process. Have the Lambda function output the results to a "processed" location in Amazon S3 that is accessible for downstream use.
  • B. Write the raw data to Amazon S3. Create an AWS Glue ETL job to perform the ETL processing against the input data. Write the ETL job in PySpark to leverage the existing logic. Create a new AWS Glue trigger to trigger the ETL job based on the existing schedule. Configure the output target of the ETL job to write to a "processed" location in Amazon S3 that is accessible for downstream use.
  • C. Write the raw data to Amazon S3. Schedule an AWS Lambda function to submit a Spark step to a persistent Amazon EMR cluster based on the existing schedule. Use the existing PySpark logic to run the ETL job on the EMR cluster. Output the results to a "processed" location in Amazon S3 that is accessible for downstream use.
  • D. Use Amazon Kinesis Data Analytics to stream the input data and perform real-time SQL queries against the stream to carry out the required transformations within the stream. Deliver the output results to a "processed" location in Amazon S3 that is accessible for downstream use.

Answer: B

Explanation:
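AWS Glue runs PySpark ETL jobs without servers to manage and supports scheduled triggers, so it meets all four requirements. AWS Lambda is not designed to run Spark workloads, a persistent Amazon EMR cluster adds servers to manage, and Kinesis Data Analytics queries streaming data rather than running a scheduled batch ETL job over data in Amazon S3.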
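
As a rough illustration of the scheduled trigger described in option B, the sketch below builds the keyword arguments one might pass to the boto3 Glue `create_trigger` call. The trigger name, job name, and cron expression are hypothetical placeholders, and the actual AWS call is left commented out because it needs credentials and an existing Glue job.

```python
def build_glue_schedule_trigger(trigger_name, job_name, cron_expression):
    """Return kwargs for a scheduled AWS Glue trigger (all names are placeholders)."""
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        # Glue schedules use a six-field cron expression wrapped in cron(...)
        "Schedule": f"cron({cron_expression})",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


# Hypothetical example: run the consolidated ETL job every day at 02:00 UTC
trigger_kwargs = build_glue_schedule_trigger(
    trigger_name="nightly-etl-trigger",
    job_name="consolidate-sources-job",
    cron_expression="0 2 * * ? *",
)

# import boto3
# boto3.client("glue").create_trigger(**trigger_kwargs)
```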

 

NEW QUESTION 58
A Machine Learning Specialist working for an online fashion company wants to build a data ingestion solution for the company's Amazon S3-based data lake.
The Specialist wants to create a set of ingestion mechanisms that will enable future capabilities comprised of:
- Real-time analytics
- Interactive analytics of historical data
- Clickstream analytics
- Product recommendations
Which services should the Specialist use?

  • A. Amazon Athena as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for near-real-time data insights; Amazon Kinesis Data Firehose for clickstream analytics; AWS Glue to generate personalized product recommendations
  • B. Amazon Athena as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for historical data insights; Amazon DynamoDB streams for clickstream analytics; AWS Glue to generate personalized product recommendations
  • C. AWS Glue as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for real-time data insights; Amazon Kinesis Data Firehose for delivery to Amazon ES for clickstream analytics; Amazon EMR to generate personalized product recommendations
  • D. AWS Glue as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for historical data insights; Amazon Kinesis Data Firehose for delivery to Amazon ES for clickstream analytics; Amazon EMR to generate personalized product recommendations

Answer: C

 

NEW QUESTION 59
A Machine Learning Specialist has built a model using Amazon SageMaker built-in algorithms and is not getting the expected accuracy. The Specialist wants to use hyperparameter optimization to increase the model's accuracy. Which method is the MOST repeatable and requires the LEAST amount of effort to achieve this?

  • A. Create an AWS Step Functions workflow that monitors the accuracy in Amazon CloudWatch Logs and relaunches the training job with a defined list of hyperparameters
  • B. Create a random walk in the parameter space to iterate through a range of values that should be used for each individual hyperparameter
  • C. Launch multiple training jobs in parallel with different hyperparameters
  • D. Create a hyperparameter tuning job and set the accuracy as an objective metric.

Answer: D

 

NEW QUESTION 60
An online reseller has a large, multi-column dataset with one column missing 30% of its data. A Machine Learning Specialist believes that certain columns in the dataset could be used to reconstruct the missing data. Which reconstruction approach should the Specialist use to preserve the integrity of the dataset?

  • A. Listwise deletion
  • B. Multiple imputation
  • C. Mean substitution
  • D. Last observation carried forward

Answer: B

 

NEW QUESTION 61
A telecommunications company is developing a mobile app for its customers. The company is using an Amazon SageMaker hosted endpoint for machine learning model inferences.
Developers want to introduce a new version of the model for a limited number of users who subscribed to a preview feature of the app. After the new version of the model is tested as a preview, developers will evaluate its accuracy. If a new version of the model has better accuracy, developers need to be able to gradually release the new version for all users over a fixed period of time.
How can the company implement the testing model with the LEAST amount of operational overhead?

  • A. Configure two SageMaker hosted endpoints that serve the different versions of the model. Create an Application Load Balancer (ALB) to route traffic to both endpoints based on the TargetVariant query string parameter. Reconfigure the app to send the TargetVariant query string parameter for users who subscribed to the preview feature. When the new version of the model is ready for release, change the ALB's routing algorithm to weighted until all users have the updated version.
  • B. Update the ProductionVariant data type with the new version of the model by using the CreateEndpointConfig operation with the InitialVariantWeight parameter set to 0. Specify the TargetVariant parameter for InvokeEndpoint calls for users who subscribed to the preview feature. When the new version of the model is ready for release, gradually increase InitialVariantWeight until all users have the updated version.
  • C. Configure two SageMaker hosted endpoints that serve the different versions of the model. Create an Amazon Route 53 record that is configured with a simple routing policy and that points to the current version of the model. Configure the mobile app to use the endpoint URL for users who subscribed to the preview feature and to use the Route 53 record for other users. When the new version of the model is ready for release, add a new model version endpoint to Route 53, and switch the policy to weighted until all users have the updated version.
  • D. Update the DesiredWeightsAndCapacity data type with the new version of the model by using the UpdateEndpointWeightsAndCapacities operation with the DesiredWeight parameter set to 0. Specify the TargetVariant parameter for InvokeEndpoint calls for users who subscribed to the preview feature. When the new version of the model is ready for release, gradually increase DesiredWeight until all users have the updated version.

Answer: D
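
The variant-weight mechanics named in the options can be sketched as follows. This is a minimal illustration, not a deployment script: it only builds the keyword arguments one might pass to the boto3 SageMaker `update_endpoint_weights_and_capacities` call and to the runtime `invoke_endpoint` call. The endpoint name, variant names, and step count are hypothetical.

```python
def ramp_requests(endpoint_name, old_variant, new_variant, steps):
    """Return a list of kwargs that shift traffic to new_variant in equal steps."""
    requests = []
    for i in range(steps + 1):
        new_weight = i / steps
        requests.append({
            "EndpointName": endpoint_name,
            "DesiredWeightsAndCapacities": [
                {"VariantName": old_variant, "DesiredWeight": 1.0 - new_weight},
                {"VariantName": new_variant, "DesiredWeight": new_weight},
            ],
        })
    return requests


def preview_invoke_kwargs(endpoint_name, new_variant, payload):
    """Return invoke_endpoint kwargs that pin a preview user to the new variant."""
    return {
        "EndpointName": endpoint_name,
        "TargetVariant": new_variant,  # bypasses weight-based routing
        "ContentType": "application/json",
        "Body": payload,
    }


# Hypothetical names; step 0 keeps the new variant at weight 0 (preview only)
plan = ramp_requests("churn-endpoint", "model-v1", "model-v2", steps=4)
preview = preview_invoke_kwargs("churn-endpoint", "model-v2", b'{"x": 1}')

# import boto3
# sm = boto3.client("sagemaker")
# sm.update_endpoint_weights_and_capacities(**plan[1])
```

With the new variant at weight 0, only calls that set `TargetVariant` reach it; each later entry in `plan` moves a larger share of ordinary traffic to the new version.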

 

NEW QUESTION 62
A company has set up and deployed its machine learning (ML) model into production with an endpoint using Amazon SageMaker hosting services. The ML team has configured automatic scaling for its SageMaker instances to support workload changes. During testing, the team notices that additional instances are being launched before the new instances are ready. This behavior needs to change as soon as possible.
How can the ML team solve this issue?

  • A. Decrease the cooldown period for the scale-in activity. Increase the configured maximum capacity of instances.
  • B. Increase the cooldown period for the scale-out activity.
  • C. Replace the current endpoint with a multi-model endpoint using SageMaker.
  • D. Set up Amazon API Gateway and AWS Lambda to trigger the SageMaker inference endpoint.

Answer: B
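
The cooldown settings mentioned in the options live in the target-tracking configuration of the scaling policy. The sketch below builds such a configuration as a plain dictionary; the target value and cooldown lengths are hypothetical, and the Application Auto Scaling call is commented out because it requires a registered scalable target.

```python
def target_tracking_config(target_invocations, scale_out_cooldown_s, scale_in_cooldown_s):
    """Return a target-tracking configuration for a SageMaker endpoint variant."""
    return {
        "TargetValue": float(target_invocations),
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        # Seconds to wait after a scale-out before another scale-out can start
        "ScaleOutCooldown": scale_out_cooldown_s,
        # Seconds to wait after a scale-in before another scale-in can start
        "ScaleInCooldown": scale_in_cooldown_s,
    }


# Hypothetical values: a longer scale-out cooldown gives new instances time to become ready
config = target_tracking_config(
    target_invocations=70, scale_out_cooldown_s=600, scale_in_cooldown_s=300
)

# import boto3
# boto3.client("application-autoscaling").put_scaling_policy(
#     PolicyName="invocations-target-tracking",          # placeholder
#     ServiceNamespace="sagemaker",
#     ResourceId="endpoint/my-endpoint/variant/AllTraffic",  # placeholder
#     ScalableDimension="sagemaker:variant:DesiredInstanceCount",
#     PolicyType="TargetTrackingScaling",
#     TargetTrackingScalingPolicyConfiguration=config,
# )
```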

 

NEW QUESTION 63
A retail company is using Amazon Personalize to provide personalized product recommendations for its customers during a marketing campaign. The company sees a significant increase in sales of recommended items to existing customers immediately after deploying a new solution version, but these sales decrease a short time after deployment. Only historical data from before the marketing campaign is available for training.
How should a data scientist adjust the solution?

  • A. Add event type and event value fields to the interactions dataset in Amazon Personalize.
  • B. Use the event tracker in Amazon Personalize to include real-time user interactions.
  • C. Add user metadata and use the HRNN-Metadata recipe in Amazon Personalize.
  • D. Implement a new solution using the built-in factorization machines (FM) algorithm in Amazon SageMaker.

Answer: B

 

NEW QUESTION 64
A manufacturing company asks its Machine Learning Specialist to develop a model that classifies defective parts into one of eight defect types. The company has provided roughly 100,000 images per defect type for training. During the initial training of the image classification model, the Specialist notices that the validation accuracy is 80%, while the training accuracy is 90%. It is known that human-level performance for this type of image classification is around 90%. What should the Specialist consider to fix this issue?

  • A. Using a different optimizer
  • B. Making the network larger
  • C. Using some form of regularization
  • D. A longer training time

Answer: C

 

NEW QUESTION 65
A Machine Learning Specialist needs to be able to ingest streaming data and store it in Apache Parquet files for exploration and analysis. Which of the following services would both ingest and store this data in the correct format?

  • A. Amazon Kinesis Data Firehose
  • B. AWS DMS
  • C. Amazon Kinesis Data Streams
  • D. Amazon Kinesis Data Analytics

Answer: A

 

NEW QUESTION 66
A Machine Learning Specialist is configuring Amazon SageMaker so multiple Data Scientists can access notebooks, train models, and deploy endpoints. To ensure the best operational performance, the Specialist needs to be able to track how often the Scientists are deploying models, GPU and CPU utilization on the deployed SageMaker endpoints, and all errors that are generated when an endpoint is invoked.
Which services are integrated with Amazon SageMaker to track this information? (Select TWO.)

  • A. AWS Config
  • B. AWS CloudTrail
  • C. AWS Health
  • D. Amazon CloudWatch
  • E. AWS Trusted Advisor

Answer: B,D

 

NEW QUESTION 67
A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries on this data.
Which solution requires the LEAST effort to be able to query this data?

  • A. Use AWS Lambda to transform the data and Amazon Kinesis Data Analytics to run queries.
  • B. Use AWS Data Pipeline to transform the data and Amazon RDS to run queries.
  • C. Use AWS Batch to run ETL on the data and Amazon Aurora to run the queries.
  • D. Use AWS Glue to catalogue the data and Amazon Athena to run queries.

Answer: D

 

NEW QUESTION 68
While reviewing the histogram for residuals on regression evaluation data, a Machine Learning Specialist notices that the residuals do not form a zero-centered bell shape, as shown. What does this mean?

  • A. The model might have prediction errors over a range of target values.
  • B. The model is predicting its target values perfectly.
  • C. There are too many variables in the model
  • D. The dataset cannot be accurately represented using the regression model

Answer: A
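
To make the idea of a non-zero-centered residual distribution concrete, here is a small sketch with made-up numbers (the values are hypothetical and not taken from the question). A residual is the actual value minus the predicted value; if the mean residual sits far from zero, the model is making systematic errors.

```python
from statistics import mean


def residuals(actual, predicted):
    """Return actual minus predicted for each pair of values."""
    return [a - p for a, p in zip(actual, predicted)]


# Hypothetical evaluation data in which the model consistently over-predicts
actual = [10.0, 12.0, 15.0, 20.0, 26.0]
predicted = [12.0, 13.5, 17.0, 23.0, 28.5]

res = residuals(actual, predicted)
bias = mean(res)  # a value well away from 0 means the histogram is not zero-centered
print(res, bias)
```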

 

NEW QUESTION 69
A Machine Learning Specialist must build out a process to query a dataset on Amazon S3 using Amazon Athena. The dataset contains more than 800,000 records stored as plaintext CSV files.
Each record contains 200 columns and is approximately 1.5 MB in size. Most queries will span 5 to 10 columns only.
How should the Machine Learning Specialist transform the dataset to minimize query runtime?

  • A. Convert the records to GZIP CSV format.
  • B. Convert the records to XML format.
  • C. Convert the records to JSON format.
  • D. Convert the records to Apache Parquet format.

Answer: D

Explanation:
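Apache Parquet is a columnar format, so Amazon Athena reads only the 5 to 10 columns a query needs instead of all 200. Compression further reduces the amount of data scanned by Amazon Athena and also reduces your S3 bucket storage, which lowers your AWS bill on both counts. Supported formats: GZIP, LZO, SNAPPY (Parquet) and ZLIB.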
https://www.cloudforecast.io/blog/using-parquet-on-athena-to-save-money-on-aws/
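
A back-of-the-envelope estimate shows why a columnar format helps with the numbers in this question. The sketch assumes every column is the same size and ignores compression, both simplifications that are not stated in the question.

```python
def scanned_mb(num_records, record_mb, total_columns, queried_columns, columnar):
    """Estimate the data scanned per query, assuming equally sized columns."""
    total = num_records * record_mb
    if not columnar:
        # Row-oriented text formats such as CSV must be read in full
        return total
    # A columnar format lets the engine read only the queried columns
    return total * queried_columns / total_columns


csv_scan = scanned_mb(800_000, 1.5, 200, 10, columnar=False)
parquet_scan = scanned_mb(800_000, 1.5, 200, 10, columnar=True)
print(csv_scan, parquet_scan)
```

Under these assumptions a 10-column query scans about 1,200,000 MB as CSV but about 60,000 MB in a columnar layout, roughly 5% of the data.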

 

NEW QUESTION 70
A large consumer goods manufacturer has the following products on sale:
* 34 different toothpaste variants
* 48 different toothbrush variants
* 43 different mouthwash variants
The entire sales history of all these products is available in Amazon S3. Currently, the company is using custom-built autoregressive integrated moving average (ARIMA) models to forecast demand for these products. The company wants to predict the demand for a new product that will soon be launched. Which solution should a Machine Learning Specialist apply?

  • A. Train an Amazon SageMaker DeepAR algorithm to forecast demand for the new product
  • B. Train a custom ARIMA model to forecast demand for the new product.
  • C. Train an Amazon SageMaker k-means clustering algorithm to forecast demand for the new product.
  • D. Train a custom XGBoost model to forecast demand for the new product

Answer: A

 

NEW QUESTION 71
A Data Scientist is evaluating different binary classification models. A false positive result is 5 times more expensive (from a business perspective) than a false negative result.
The models should be evaluated based on the following criteria:
1) Must have a recall rate of at least 80%
2) Must have a false positive rate of 10% or less
3) Must minimize business costs
After creating each binary classification model, the Data Scientist generates the corresponding confusion matrix.
Which confusion matrix represents the model that satisfies the requirements?

  • A. TN = 98, FP = 2
    FN = 18, TP = 82
  • B. TN = 96, FP = 4
    FN = 10, TP = 90
  • C. TN = 91, FP = 9
    FN = 22, TP = 78
  • D. TN = 99, FP = 1
    FN = 21, TP = 79

Answer: A

Explanation:
The following calculations are required:
TP = True Positive
FP = False Positive
FN = False Negative
TN = True Negative
Recall = TP / (TP + FN)
False Positive Rate (FPR) = FP / (FP + TN)
Cost = 5 * FP + FN

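Options A and B have a recall of at least 80% and an FPR of 10% or less (A: recall 82%, FPR 2%; B: recall 90%, FPR 4%), while C and D fall short on recall (78% and 79%). Of the two qualifying options, A is the most cost effective: its cost is 5 * 2 + 18 = 28, compared with 5 * 4 + 10 = 30 for B.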
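
The formulas above can be checked with a short script. This is a minimal sketch using the four confusion matrices exactly as lettered in the options; the function and variable names are illustrative only.

```python
def evaluate(tn, fp, fn, tp, fp_cost=5, fn_cost=1):
    """Return (recall, false positive rate, business cost) for one confusion matrix."""
    recall = tp / (tp + fn)
    fpr = fp / (fp + tn)
    cost = fp_cost * fp + fn_cost * fn
    return recall, fpr, cost


# (TN, FP, FN, TP) for each option as listed in the question
matrices = {
    "A": (98, 2, 18, 82),
    "B": (96, 4, 10, 90),
    "C": (91, 9, 22, 78),
    "D": (99, 1, 21, 79),
}

qualifying = {}
for name, (tn, fp, fn, tp) in matrices.items():
    recall, fpr, cost = evaluate(tn, fp, fn, tp)
    # Keep only models that meet both the recall and the FPR thresholds
    if recall >= 0.80 and fpr <= 0.10:
        qualifying[name] = cost

best = min(qualifying, key=qualifying.get)
print(qualifying, best)
```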

 

NEW QUESTION 72
A Machine Learning Specialist is building a model to predict future employment rates based on a wide range of economic factors. While exploring the data, the Specialist notices that the magnitudes of the input features vary greatly. The Specialist does not want variables with a larger magnitude to dominate the model.
What should the Specialist do to prepare the data for model training?

  • A. Apply normalization to ensure each field will have a mean of 0 and a variance of 1 to remove any significant magnitude.
  • B. Apply the orthogonal sparse bigram (OSB) transformation to apply a fixed-size sliding window to generate new features of a similar magnitude.
  • C. Apply quantile binning to group the data into categorical bins to keep any relationships in the data by replacing the magnitude with distribution.
  • D. Apply the Cartesian product transformation to create new combinations of fields that are independent of the magnitude.

Answer: A

Explanation:
https://docs.aws.amazon.com/machine-learning/latest/dg/data-transformations-reference.html
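
The normalization in option A (mean of 0, variance of 1) is often called standardization or z-scoring. Below is a minimal sketch with hypothetical feature values; it uses the population standard deviation, which is one of several reasonable conventions.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and (population) variance 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical features on very different scales
gdp_billions = [1000.0, 2000.0, 3000.0]
interest_rate = [0.01, 0.02, 0.03]

scaled_gdp = standardize(gdp_billions)
scaled_rate = standardize(interest_rate)
print(scaled_gdp, scaled_rate)
```

After scaling, both features span the same range, so neither dominates the model purely because of its units.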

 

NEW QUESTION 73
A Data Science team within a large company uses Amazon SageMaker notebooks to access data stored in Amazon S3 buckets. The IT Security team is concerned that internet-enabled notebook instances create a security vulnerability where malicious code running on the instances could compromise data privacy. The company mandates that all instances stay within a secured VPC with no internet access, and data communication traffic must stay within the AWS network.
How should the Data Science team configure the notebook instance placement to meet these requirements?

  • A. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Ensure the VPC has a NAT gateway and an associated security group allowing only outbound connections to Amazon S3 and Amazon SageMaker.
  • B. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Place the Amazon SageMaker endpoint and S3 buckets within the same VPC.
  • C. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Ensure the VPC has S3 VPC endpoints and Amazon SageMaker VPC endpoints attached to it.
  • D. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Use IAM policies to grant access to Amazon S3 and Amazon SageMaker.

Answer: C

Explanation:
We must use a VPC endpoint (either a Gateway Endpoint or an Interface Endpoint) to comply with the requirement that "data communication traffic must stay within the AWS network".
https://docs.aws.amazon.com/sagemaker/latest/dg/notebook-interface-endpoint.html

 

NEW QUESTION 74
......
