Intro to Python for Data Science
This is a college-level introductory Python course geared towards Data Analytics and Data Science applications. Trainees learn Python by solving programming problems of gradually increasing complexity ranging from simple calculations, working with text strings, loops, conditions, variables, and functions to file operations and data visualization.
Note: This course is only available as part of NCLab’s Data Analyst Career Training program.
Course Features
- Trainees learn at their own pace by reading tutorials, watching videos, going through examples, and solving programming challenges.
- Every short lesson is followed by self-assessment, so that trainees instantly know whether they have mastered the concept.
- Trainees obtain real-time help from the NCLab AI tutorial engine, as well as remote assistance from live course instructors as needed.
- Trainees learn how to use powerful Python libraries including Matplotlib, Numpy and Scipy.
- An interactive Python coding app allows trainees to create portfolio artifacts and easily share them online.
Prerequisites
This is not an introductory computer programming course. Trainees should be familiar with basic concepts of computer programming including syntax, counting and conditional loops, conditions, local and global variables, functions, and recursion. For complete beginners who have little or no computer programming experience, NCLab provides an introductory course, Introduction to Computer Programming, as part of the training program.
Student Learning Outcomes (SLO)
- Use Python for calculations.
- Use loops and conditions, define and use custom functions.
- Define basic built-in data types.
- Explain the difference between local and global scopes and variables.
- Work with tuples, lists, dictionaries, and sets.
- Use the break and continue statements in loops.
- Use the else branch with for and while loops.
- Work with text strings, regular expressions, and the ASCII table.
- Identify characteristics of mutable and immutable data types, locate objects in memory.
- Use the Matplotlib and Numpy libraries.
- Plot graphs of functions and parametric curves.
- Examine basic principles of software design.
- Compare shallow and deep copying.
- Read from and write to text files.
- Use assertions and exceptions.
- Visualize scientific data using wireframe plots, surface plots, contour plots, and color maps.
Equipment Requirements
Computer, laptop or tablet with Internet access, email, and one of the following browsers:
- Google Chrome
- Mozilla Firefox
- Microsoft Edge
- Safari
Course Structure and Length
This course is self-paced, and trainees practice each skill and concept as they go. Automatic feedback is built into the course for both practice exercises and quizzes. The course is divided into four Units, and each Unit is composed of five Sections. Each Section consists of 7 instructional/practice levels, a quiz, and a master (proficiency) level. Quizzes can be retaken after 12 hours.
Introduction to Python for Data Science is designed to take approximately 80 hours. Since the course is self-paced, the amount of time required to complete the course will vary from trainee to trainee.
Unit 1 (Introduction)
Section 1
- Brief history of Python.
- Using Python as a powerful scientific calculator.
- Priorities of arithmetic operators and parentheses, integer division.
- Python libraries, the old and new ways of importing them.
- Importing the Fractions library and working with fractions.
- Using the built-in function help().
- Defining numerical (integer and real) variables and text strings.
- Importing Numpy and using its functionality.
- Displaying results with the built-in function print().
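As a brief illustration of the Section 1 topics, the following sketch shows operator priority, integer division, and exact arithmetic with the fractions module. The variable names are illustrative only and are not taken from the course materials.

```python
from fractions import Fraction

# ** is evaluated before *, and * before +.
without_parens = 2 + 3 * 4 ** 2    # 2 + (3 * 16) = 50
with_parens = (2 + 3) * 4 ** 2     # 5 * 16 = 80

# True division yields a float; floor division of two integers yields an integer.
true_div = 7 / 2     # 3.5
floor_div = 7 // 2   # 3

# Fractions keep results exact instead of rounding them to a float.
total = Fraction(1, 3) + Fraction(1, 6)   # Fraction(1, 2)

print(without_parens, with_parens, true_div, floor_div, total)
```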
Section 2
- Using the floor division operator //, the modulo operator %, and the power operator **.
- Using the operator // with negative and real numbers.
- Why real numbers are not represented exactly in the computer, and how this can lead to problems with the floor division and modulo operators.
- Using the assignment operator = and the comparison operator ==.
- Working with the Boolean values True and False.
- The result of the comparison operator == is either True or False.
- One should never use the operator == to compare real numbers.
- Using the built-in function abs() to calculate the absolute value of numbers.
- The result of the comparison operators <, >, <=, >=, != is either True or False.
- How to reach the limit of the finite computer arithmetic on any computer.
- Using the arithmetic operators +=, -=, *=, /=, //=, %= and **=.
- Working with the most important units of data size including b, KB, MB and GB.
- The difference between KB and kB.
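Several of the Section 2 points can be seen in a few lines. The sketch below is only an illustration; the tolerance value is an assumption chosen for the example, not a recommendation from the course.

```python
# Floor division rounds toward minus infinity, so negative operands can surprise.
neg_floor = -7 // 2    # -4, not -3
neg_mod = -7 % 2       # 1, because -7 == 2 * (-4) + 1

# 0.1 has no exact binary representation, which affects // and %.
float_floor = 1 // 0.1   # 9.0 rather than the 10.0 one might expect

# Comparing floats with == is fragile; compare the distance to a tolerance instead.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)                 # False
TOLERANCE = 1e-9                               # illustrative value (assumption)
close_enough = abs(total - 0.3) < TOLERANCE    # True

print(neg_floor, neg_mod, float_floor, exactly_equal, close_enough)
```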
Section 3
- Defining and calling functions.
- Importance of writing docstrings and commenting your code.
- Function parameters vs. arguments.
- Functions returning multiple values.
- Global and local scopes, global and local variables.
- Functions should never change the values of global variables.
- Working with tuples, unpacking them, accessing individual items via indices.
- Parsing tuples one item at a time using the for loop.
- The range() function.
- Using nested for loops.
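A minimal sketch of the Section 3 ideas follows: a documented function that returns two values, tuple unpacking at the call site, and nested for loops over range(). The function name min_and_max is hypothetical.

```python
def min_and_max(values):
    """Return the smallest and the largest item of a non-empty sequence."""
    smallest = largest = values[0]   # local variables, not visible outside the function
    for value in values:
        if value < smallest:
            smallest = value
        if value > largest:
            largest = value
    return smallest, largest         # the two values are packed into a tuple


low, high = min_and_max((3, 1, 4, 1, 5))   # tuple unpacking

# Nested for loops with range(): all pairs (i, j) with 0 <= j < i < 3.
pairs = []
for i in range(3):
    for j in range(i):
        pairs.append((i, j))

print(low, high, pairs)
```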
Section 4
- Creating empty and non-empty lists.
- Obtaining the length of lists, function len().
- Adding items to lists, methods append() and insert().
- Removing items from lists, methods pop() and remove(), keyword del.
- Adding lists and multiplying them with integers.
- Mutability of lists.
- Parsing lists with the for loop.
- Accessing individual list items via their indices.
- Using the while loop.
- Slicing lists, creating copies and reversed copies of lists via slicing.
- Reversing lists and sorting them, list methods reverse() and sort().
- Reversing lists and sorting them, built-in functions reversed() and sorted().
- Making list and tuple items unique.
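The list operations named in Section 4 can be sketched as follows. The last step shows just one possible way to make items unique while keeping their order; the course may teach a different approach.

```python
items = [3, 1, 2]
items.append(1)          # [3, 1, 2, 1]
items.insert(0, 5)       # [5, 3, 1, 2, 1]
last = items.pop()       # removes and returns the last item, 1
items.remove(3)          # removes the first occurrence of 3 -> [5, 1, 2]

copy_of_items = items[:]       # slicing creates a new list
reversed_copy = items[::-1]    # [2, 1, 5]
sorted_copy = sorted(items)    # new sorted list; items itself is unchanged here
items.sort()                   # sorts in place; items is now [1, 2, 5]

# One possible way to drop duplicates while keeping first-seen order.
unique = list(dict.fromkeys([1, 2, 2, 3, 1]))

print(items, copy_of_items, reversed_copy, sorted_copy, unique)
```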
Section 5
- Working with Boolean expressions and variables.
- The if, if-else and if-elif-else statements.
- Using the keyword ‘in’ to check if a given item is present in a tuple or list.
- Using the method count() to count occurrences of given items in tuples and lists.
- Using the method index() to obtain positions of given items in tuples and lists.
- Working with the Boolean operators and, or, not.
- Chaining arithmetic comparison operators.
- Generating random numbers.
- Using the break and continue statements in loops.
- Working with infinite while loops.
- The ‘pass’ statement.
- Using the else branch with for and while loops.
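The else branch of a loop is the least familiar item in Section 5, so a small sketch may help. It runs only when the loop ends without break. The function name first_negative is hypothetical.

```python
def first_negative(numbers):
    """Return the first negative number in numbers, or None if there is none."""
    for number in numbers:
        if number >= 0:
            continue        # skip to the next item
        result = number
        break               # leave the loop early
    else:
        # Reached only when the loop finished without hitting break.
        result = None
    return result


print(first_negative([3, -2, -5]))
print(first_negative([1, 2]))
```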
Unit 2 (Working with Text Strings)
Section 6
- Defining text strings, using single and double quotes.
- Problems associated with trailing spaces, function repr().
- Comparing text strings with the == operator.
- Optional parameters ‘sep’ and ‘end’ of the built-in function print().
- Adding text strings and multiplying them with positive integers.
- Updating text string variables with the operators += and *=.
- PEP 8, the Style Guide for Python Code.
Section 7
- Combining single and double quotes in text strings.
- Obtaining the length of text strings, function len().
- Working with the special characters \n, \" and \'.
- Casting numbers to text strings, function str().
- Inserting numbers into text strings.
- Casting text strings to numbers, functions int() and float().
- Displaying the type of variables, function type().
- Checking the type of variables at runtime, function isinstance().
- Using the text string methods lower(), upper() and title().
- Text string methods never change the original text string.
- Cleaning text strings with the methods rstrip(), lstrip() and strip().
- Splitting a text string into a list of words, method split().
- Checking for substrings, keyword ‘in’.
- Making a text search case-insensitive.
- Counting the occurrences of substrings in text strings, method count().
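The Section 7 topics fit together naturally when cleaning a piece of text. The sketch below uses an invented sample string and illustrative variable names.

```python
raw = "  Data Science with Python  \n"

cleaned = raw.strip()                        # returns a new string; raw is unchanged
words = cleaned.split()                      # ['Data', 'Science', 'with', 'Python']
has_python = "python" in cleaned.lower()     # case-insensitive substring check
th_count = cleaned.lower().count("th")       # "with" and "python" each contain "th"

# Casting between text strings and numbers.
number = int("42") + float("0.5")            # 42.5
label = "Total: " + str(number)              # 'Total: 42.5'
is_float = isinstance(number, float)         # True

print(words, has_python, th_count, label, is_float)
```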
Section 8
- Working with the ASCII table, functions ord() and chr().
- Searching for and replacing substrings in text strings, method replace().
- Zipping two lists and using the for loop to parse them at the same time.
- Erasing parts of text strings.
- Cleaning text strings from unwanted characters.
- Swapping the contents of two text strings.
- Swapping two substrings in a text string.
- Working with useful text string methods such as isalpha(), isalnum(), isdigit() etc.
Section 9
- Text strings are immutable objects in Python.
- Obtaining the memory address of Python objects, function id().
- Accessing individual characters in text strings via their indices.
- Slicing text strings and reversing them.
- Retrieving system date and time.
- Obtaining the position of a substring in a given text string, method index().
- Counting the occurrences of a substring in a given text string, method count().
- Translating decimal numbers into binary format, function bin().
- Understanding how text strings are represented in computer memory.
- Comparing text strings using the operators <, <=, >, >=.
- Creating text characters which are not present on the keyboard.
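The following sketch touches several Section 9 ideas: indexing, slicing, string comparison, bin(), and the fact that strings cannot be changed in place. It is an illustration only, with made-up sample values.

```python
word = "python"

first, last = word[0], word[-1]     # 'p' and 'n'
reversed_word = word[::-1]          # 'nohtyp'
position = word.index("th")         # 2
binary = bin(10)                    # '0b1010'

# Strings are compared character by character using their character codes.
alphabetical = "apple" < "banana"       # True
uppercase_first = "Zebra" < "apple"     # True, since ord('Z') is 90 and ord('a') is 97

# Strings are immutable: item assignment raises TypeError.
try:
    word[0] = "P"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# "Changing" a string really means building a new object.
capitalized = "P" + word[1:]
different_objects = id(capitalized) != id(word)

print(reversed_word, position, binary, assignment_failed, capitalized)
```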
Section 10
- What regular expressions are and what they are useful for.
- Python’s regular expressions module ‘re’.
- Using the functions search(), match() and findall().
- Greedy and non-greedy repeating patterns.
- Using character classes and groups of characters.
- Working with the most important metacharacters and special sequences.
- Mining unknown file names and email addresses from text data.
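To make the Section 10 topics concrete, here is a minimal sketch using the re module. The sample texts are invented, and the email pattern is a deliberately simple assumption that does not cover every valid address.

```python
import re

# match() only looks at the start of the string; search() scans the whole string.
at_start = re.match(r"\d+", "abc 123")              # None
anywhere = re.search(r"\d+", "abc 123").group()     # '123'

# Greedy vs. non-greedy repetition.
markup = "<b>bold</b> and <i>italic</i>"
greedy = re.findall(r"<.*>", markup)     # one match spanning the whole string
lazy = re.findall(r"<.*?>", markup)      # each tag separately

# A simplified email pattern (assumption: good enough for this sample text only).
text = "Contact alice@example.com or bob@example.org."
emails = re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)

print(anywhere, greedy, lazy, emails)
```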
Unit 3 (Plotting, Drawing, and Software Design)
Section 11
- Importing the Matplotlib and Numpy libraries and abbreviating their names.
- Defining lines and polylines using X and Y arrays.
- Plotting polylines, function plot().
- Assigning colors to objects.
- Displaying plots, function show().
- Displaying two or more objects simultaneously.
- Making both axes equally-scaled with axis("equal").
- Hiding axes with axis("off").
- Filling closed areas with color, function fill().
- Changing the width of lines via the optional keyword argument ‘linewidth’.
- Interrupting polylines with the keyword ‘None’.
- Reversing the orientation of polylines, and drawing hollow objects.
Section 12
- Using the Numpy function linspace() to create equidistant grids.
- Using arrays created with linspace() in calculations.
- Plotting graphs of functions with a linspace() array as the X variable.
- Drawing circles centered at (Cx, Cy), formula x = Cx + R*cos(t), y = Cy + R*sin(t).
- Drawing regular polygons by reducing the number of edges of the circle.
- Drawing circular arcs.
- Drawing ellipses, formula x = Cx + Rx*cos(t), y = Cy + Ry*sin(t).
- Drawing spirals, formula x = Cx + t*cos(t), y = Cy + t*sin(t).
- Casting numpy.ndarray to a list and altering it when needed.
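The parametric formulas of Section 12 can be tried out without any plotting library. The sketch below computes points with x = Cx + R*cos(t), y = Cy + R*sin(t) for a circle and x = Cx + t*cos(t), y = Cy + t*sin(t) for a spiral. The helper equidistant() is a hypothetical pure-Python stand-in for Numpy's linspace(), and all function names are assumptions made for this example.

```python
import math


def equidistant(start, stop, count):
    """Return count evenly spaced values from start to stop, endpoints included.

    Hypothetical stand-in for numpy.linspace(); requires count >= 2.
    """
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


def circle_points(cx, cy, r, count=100):
    """Points on a circle of radius r centered at (cx, cy)."""
    ts = equidistant(0, 2 * math.pi, count)
    xs = [cx + r * math.cos(t) for t in ts]
    ys = [cy + r * math.sin(t) for t in ts]
    return xs, ys


def spiral_points(cx, cy, t_max, count=100):
    """Points on a spiral around (cx, cy) whose radius grows with t."""
    ts = equidistant(0, t_max, count)
    xs = [cx + t * math.cos(t) for t in ts]
    ys = [cy + t * math.sin(t) for t in ts]
    return xs, ys


# Reducing the number of points turns the circle into a regular polygon:
# 7 points (first and last coincide) give the 6 edges of a hexagon.
hex_x, hex_y = circle_points(0, 0, 1, count=7)
spiral_x, spiral_y = spiral_points(2, 3, 4 * math.pi, count=50)
```

The resulting X and Y lists could then be passed to a plotting function such as Matplotlib's plot().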
Section 13
- Working with 2D arrays using nested for loops.
- Using matrix-style indices for items in 2D arrays.
- Defining and using functions with default parameter values.
- Setting X and Y axis ranges and adding titles to Matplotlib plots.
- Accessing items in linspace() arrays via their indices.
Section 14
- Why it is important to plan software very carefully before starting to code.
- API = Application Programming Interface.
- Why internal data structures should never be exposed to the user.
- Designing an API to sustain internal software changes.
- Coding numerous basic shapes including lines, polylines, squares, triangles, quads, rectangles, polygons, circles, arcs, and rings.
- Working with three types of list comprehension.
- Creating empty drawings and adding shapes to them.
- Rotating, moving and scaling shapes, merging them, and reversing their orientation.
Section 15
- Why good software should be organized like an army.
- Why the Graphics Editor should provide functions to work with drawings as opposed to working with individual shapes.
- Using list comprehension to move, rotate and scale objects.
- How to NOT duplicate lists.
- Shallow and deep copying.
- The meaning of a “wrapper”.
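The difference between assigning a list, copying it shallowly, and copying it deeply is easiest to see on a nested list. The sketch below uses the standard copy module; the variable names and data are illustrative.

```python
import copy

shapes = [[0, 0], [1, 1]]        # a list of points, each point itself a list

alias = shapes                   # no copy at all: a second name for the same list
shallow = copy.copy(shapes)      # new outer list, but the inner lists are shared
deep = copy.deepcopy(shapes)     # new outer list and new inner lists

shapes[0][0] = 99                # change an inner list
shapes.append([2, 2])            # change the outer list

print(alias)      # sees both changes
print(shallow)    # sees the inner change only
print(deep)       # sees neither change
```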
Unit 4 (Files, Data, and Visualization)
Section 16
- The old and new ways to open a file.
- Opening a text file for reading with the with statement.
- Parsing a text file line-by-line using the for loop.
- Cleaning text strings with strip(), lstrip() and rstrip().
- Counting lines, words and characters in a text file.
- Working with the file pointer, methods read(), seek() and tell().
- Rewinding a file and when this can be useful.
- Reading selected lines, method readline().
- Working with sets, understanding the differences between sets and lists.
- Creating empty and non-empty sets.
- Adding elements to sets and removing elements.
- Checking the number of items in a set.
- Checking for the presence of an item in a set.
- Checking for subsets and supersets.
- Creating set unions, intersections, and differences.
- Using sets to extract unique words from a text file.
- Using sets to remove duplicate items from lists.
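As a sketch of how the Section 16 topics combine, the example below writes a small sample file to a temporary directory, reads it back line by line using the with statement, and collects the unique words in a set. The file name and its contents are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")    # hypothetical file name

    # Create a small file so that the example is self-contained.
    with open(path, "w") as f:
        f.write("the cat\nthe dog\n")

    line_count = 0
    unique_words = set()
    with open(path) as f:                        # the file is closed automatically
        for line in f:
            line_count += 1
            unique_words.update(line.strip().split())

# Sets also remove duplicates from lists (the original order is not preserved).
without_duplicates = set([1, 2, 2, 3, 1])

print(line_count, sorted(unique_words), without_duplicates)
```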
Section 17
- ARPANET, the first version of the Internet, and ASCII art.
- Opening a text file for writing and writing text strings to it.
- Using the file flags ‘w+’, ‘r+’ and ‘a+’.
- Potential risks related to writing to a text file.
- Catching IOError exceptions, the try-except statement.
- Other types of exceptions in Python, and where to find a complete list.
- Extracting all lines from a text file at once as a list of text strings.
- Writing a list of text strings to a text file at once.
- Reading the whole text file into a text string.
- The importance of always checking user data.
- Using assertions and exceptions.
- Escaping the backslash character ‘\’ as ‘\\’.
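A minimal sketch of the try-except statement and of an assertion follows. The function names and the file name are hypothetical. Note that in Python 3, IOError is an alias of OSError, so catching either one handles a missing file.

```python
def read_text(path):
    """Return the contents of a text file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError as error:      # IOError is an alias of OSError in Python 3
        print("Could not read file:", error)
        return None


def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    assert len(values) > 0, "values must not be empty"   # check the input first
    return sum(values) / len(values)


missing = read_text("no_such_file_for_this_example.txt")   # assumed not to exist
mean = average([1, 2, 3])
```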
Section 18
- Bitmap (raster) and vector images.
- PBM (portable bitmap), PGM (portable grey map) and PPM (portable pixmap) images and why they are useful.
- The structure of PBM, PGM and PPM image files.
- Leaving out comments while reading a text file.
- Reading a sequence of numbers from a file and converting it into a 2D array.
- Working with 2D and 3D Numpy arrays, nested loops and indices.
- Writing image files to disk.
- Uploading custom image and data files to NCLab.
- Creating image viewers for PBM, PGM and PPM images based on 2D and 3D Numpy arrays.
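To illustrate reading a sequence of numbers while leaving out comments, here is a small sketch that parses a plain-text PBM image into a 2D array. It assumes the plain "P1" variant with whitespace-separated pixel values and uses nested Python lists rather than Numpy arrays; the function name parse_pbm and the sample image are hypothetical.

```python
def parse_pbm(text):
    """Parse a plain (P1) PBM image given as text into a list of pixel rows.

    Assumption: pixel values are separated by whitespace.
    """
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]     # drop everything after a comment sign
        tokens.extend(line.split())

    assert tokens and tokens[0] == "P1", "not a plain PBM image"
    width, height = int(tokens[1]), int(tokens[2])
    pixels = [int(token) for token in tokens[3:]]
    assert len(pixels) == width * height, "unexpected number of pixels"

    # Cut the flat sequence into rows of the given width.
    return [pixels[row * width:(row + 1) * width] for row in range(height)]


sample = "P1\n# a tiny 3x2 image\n3 2\n1 0 1\n0 1 0\n"
image = parse_pbm(sample)
print(image)
```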
Section 19
- Creating empty and non-empty dictionaries.
- Dictionaries are formed by key:value pairs.
- Keys are unique but values can be repeated.
- Adding and removing items, accessing values using keys.
- Parsing a dictionary using a for loop.
- Extracting the lists of keys, values, and items.
- Zipping the lists of keys and values to create a dictionary.
- Reversing a dictionary using comprehension.
- Combining dictionaries and finding keys which correspond to repeated values.
- The mutability of the dictionary object in Python.
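The dictionary operations listed in Section 19 can be sketched as follows, using invented sample data. Grouping keys by value with setdefault() is one possible way to find keys that share a repeated value.

```python
names = ["alice", "bob", "carol"]
ages = [30, 25, 30]

age_of = dict(zip(names, ages))   # zip keys and values into a dictionary
age_of["dave"] = 41               # add an item
del age_of["bob"]                 # remove an item

# Reversing with a comprehension works cleanly only when the values are unique.
codes = {"a": 1, "b": 2}
reversed_codes = {value: key for key, value in codes.items()}

# Group the keys that correspond to the same (possibly repeated) value.
names_by_age = {}
for name, age in age_of.items():
    names_by_age.setdefault(age, []).append(name)

print(age_of, reversed_codes, names_by_age)
```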
Section 20
- Visualizing data obtained from measurements and computations.
- Using CSV and other data formats.
- Using Numpy, Matplotlib, and Matplotlib’s mplot3d toolkit.
- Displaying measurement data using graphs and bar charts.
- Displaying percentages using pie charts.
- Displaying graphs of functions of two variables.
- Displaying 2D measurement data on structured grids using wireframe plots, surface plots, contour plots, and color maps.
- Displaying scientific data computed on unstructured triangular grids.
- Displaying 2D data represented as 2D Numpy arrays.
- Visualizing MRI data of the human brain.