Working in a hands-on learning environment led by an expert Data Science Boot camp instructor, students will learn to visualize complex multivariable datasets and train a decision tree machine learning algorithm.
Unlimited Duration
January 25, 2021
Data Science Boot camp is a comprehensive set of challenging projects carefully designed to grow your data science skills from novice to master. Veteran data scientist Leonard Apeltsin sets 10 increasingly difficult exercises that test your abilities against the kinds of problems you’d encounter in the real world. As you solve each challenge, you’ll acquire and expand the data science and Python skills you’ll use as a professional data scientist. Ranging from text processing to machine learning, each project comes complete with a unique downloadable data set and a fully explained, step-by-step solution. Because these projects come from Dr. Apeltsin’s vast experience, each solution highlights the most likely failure points along with practical advice for getting past unexpected pitfalls. When you wrap up these 10 awesome exercises, you’ll have a diverse, relevant skill set that’s transferable to working in industry.
Course Curriculum

 Sample Space Analysis: An Equation-Free Approach for Measuring Uncertainty in Outcomes 00:00:00
 Computing Non-Trivial Probabilities 00:00:00
 Computing Probabilities Over Interval Ranges 00:00:00

 Basic Matplotlib Plots 00:00:00
 Plotting Coin-Flip Probabilities 00:00:00

 Simulating Random Coin-Flips and Dice-Rolls Using NumPy 00:00:00
 Computing Confidence Intervals Using Histograms and NumPy Arrays 00:00:00
 Leveraging Confidence Intervals to Analyze a Biased Deck of Cards 00:00:00
 Using Permutations to Shuffle Cards 00:00:00

 Overview 00:00:00
 Predicting Red Cards within a Shuffled Deck 00:00:00
 Optimizing Strategies Using the Sample Space for a 10-Card Deck 00:00:00
 Key Takeaways 00:00:00
 Part 2. Case Study 2: Assessing Online Ad-Clicks for Significance 00:00:00

 Exploring the Relationships between Data and Probability Using SciPy 00:00:00
 Mean as a Measure of Centrality 00:00:00
 Variance as a Measure of Dispersion 00:00:00

 Manipulating the Normal Distribution Using SciPy 00:00:00
 Determining Mean and Variance of a Population through Random Sampling 00:00:00
 Making Predictions Using Mean 00:00:00

 Assessing the Divergence Between Sample Mean and Population Mean 00:00:00
 Data Dredging: Coming to False Conclusions through Oversampling 00:00:00
 Bootstrapping with Replacement: Testing a Hypothesis When the Population Variance is Unknown 00:00:00
 Permutation Testing: Comparing Means of Samples when the Population Parameters are Unknown 00:00:00

 Storing Tables Using Basic Python 00:00:00
 Exploring Tables Using Pandas 00:00:00
 Retrieving Table Columns 00:00:00
 Retrieving Table Rows 00:00:00
 Modifying Table Rows and Columns 00:00:00
 Saving and Loading Table Data 00:00:00
 Visualizing Tables Using Seaborn 00:00:00

 Processing the Ad-Click Table in Pandas 00:00:00
 Computing P-values from Differences in Means 00:00:00
 Determining Statistical Significance 00:00:00
 Shades of Blue: A Real-Life Cautionary Tale 00:00:00
 Key Takeaways 00:00:00
 Part 3. Case Study 3: Tracking Disease Outbreaks Using News Headlines 00:00:00

 Using Centrality to Discover Clusters 00:00:00
 K-Means: A Clustering Algorithm for Grouping Data into K Central Groups 00:00:00
 Using the Elbow Method 00:00:00
 Using Density to Discover Clusters 00:00:00
 DBSCAN: A Clustering Algorithm for Grouping Data Based on Spatial Density 00:00:00
 Analyzing Clusters Using Pandas 00:00:00

 The Great-Circle Distance: A Metric for Computing Distances Between 2 Global Points 00:00:00
 Plotting Maps Using Basemap 00:00:00
 Location Tracking Using GeoNamesCache 00:00:00
 Matching Location Names in Text 00:00:00

 Overview 00:00:00
 Extracting Locations from Headline Data 00:00:00
 Visualizing and Clustering the Extracted Location Data 00:00:00
 Extracting Insights from Location Clusters 00:00:00
 Key Takeaways 00:00:00
 Part 4. Case Study 4: Using Online Job Postings to Improve Your Data Science Resume 00:00:00

 Simple Text Comparison 00:00:00
 Vectorizing Texts Using Word Counts 00:00:00
 Matrix Multiplication for Efficient Similarity Calculation 00:00:00
 Computational Limits of Matrix Multiplication 00:00:00

 Clustering 2D Data in 1-Dimension 00:00:00
 Dimension Reduction Using PCA and Scikit-Learn 00:00:00
 Clustering 4D Data in 2-Dimensions 00:00:00
 Computing Principal Components Without Rotation 00:00:00
 Efficient Dimension Reduction Using SVD and Scikit-Learn 00:00:00


 The Structure of HTML Documents 00:00:00
 Parsing HTML Using Beautiful Soup 00:00:00
 Downloading and Parsing Online Data 00:00:00

 Overview 00:00:00
 Extracting Skill Requirements from Job Posting Data 00:00:00
 Filtering Jobs by Relevance 00:00:00
 Conclusion 00:00:00
 Key Takeaways 00:00:00
