Posted: March 8th, 2022

1. Create a Jupyter notebook to perform all your work in.

2. Pick any business-related data set from Kaggle. Describe in one paragraph the dataset and the “business use” you hope to use it for.

3. Using pandas and numpy, wrangle the data to make it manageable.

4. Using k-means and hierarchical clustering, run clustering on the data set. Describe the results, what they mean, and what you learned from the data.

5. Use 5-fold cross-validation and perform classification on the data set using Decision Trees and Random Forest. Describe the results, what you learned about the data, and how well your classifications performed in testing.

6. Repeat problem 5, but this time perform regression on your data using SVM and XGBoost. Again describe your results, what you learned about the data, and how accurate the regression was.

7. Using the Python visualization library of your choice, create relevant visualizations for the above tasks and to otherwise help interpret your data. Provide some functional descriptions of your visualizations and what can be deduced from them.

8. Submit all of the above in one Jupyter notebook along with your datasets. Please also upload the dataset you chose from Kaggle.
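
As a rough illustration of how tasks 3 through 6 might fit together in a notebook, here is a minimal sketch. Everything about the data is an assumption: the assignment does not name a dataset, so the code builds a small synthetic table with hypothetical columns (visits, tenure_months, annual_spend, churned) in place of a Kaggle file. scikit-learn's GradientBoostingRegressor stands in for XGBoost so the sketch runs without the xgboost package; xgboost's XGBRegressor follows the same estimator interface and could be swapped in. The cluster count and model settings are arbitrary starting points, not recommendations.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.ensemble import GradientBoostingRegressor, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a Kaggle dataset (hypothetical columns, made-up values).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "visits": rng.poisson(10, n).astype(float),
    "tenure_months": rng.integers(1, 60, n).astype(float),
})
df["annual_spend"] = 50 * df["visits"] + 5 * df["tenure_months"] + rng.normal(0, 20, n)
df["churned"] = (df["visits"] + rng.normal(0, 2, n) < 9).astype(int)
df.loc[df.index[:10], "visits"] = np.nan  # inject missing values to clean up

# Task 3: wrangle with pandas/numpy (drop duplicates, impute missing values).
clean = df.drop_duplicates().copy()
clean["visits"] = clean["visits"].fillna(clean["visits"].median())

features = ["visits", "tenure_months"]
X = clean[features]
X_scaled = StandardScaler().fit_transform(X)

# Task 4: k-means and hierarchical (agglomerative, Ward linkage) clustering.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X_scaled)

# Task 5: 5-fold cross-validated classification of the hypothetical "churned" label.
classifiers = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
clf_scores = {
    name: cross_val_score(model, X, clean["churned"], cv=5, scoring="accuracy")
    for name, model in classifiers.items()
}

# Task 6: 5-fold cross-validated regression of the hypothetical "annual_spend" target.
# GradientBoostingRegressor is a stand-in for XGBoost, not XGBoost itself.
regressors = {
    "svm": make_pipeline(StandardScaler(), SVR(C=100.0)),
    "boosted_trees": GradientBoostingRegressor(random_state=0),
}
reg_scores = {
    name: cross_val_score(model, X, clean["annual_spend"], cv=5, scoring="r2")
    for name, model in regressors.items()
}

for name, scores in clf_scores.items():
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
for name, scores in reg_scores.items():
    print(f"{name}: mean R^2 = {scores.mean():.3f}")
```

With a real dataset, the synthetic block would be replaced by pd.read_csv on the downloaded file, and the feature and target columns would be chosen to match the business use described in task 2.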
